Taxonomy and Performance Evaluation of Hybrid Beamforming for 5G and Beyond Systems

Increasing demand for wireless connectivity with higher data rates and lower latency is fueling the exploration of the millimeter-wave (mmWave) spectrum and massive MIMO communications. Both technologies are recognized as key enablers of 5G and beyond-5G (B5G) networks. Hybrid beamforming is one of the most promising energy- and cost-effective approaches to realizing mmWave massive MIMO communications with lower complexity and smaller training overhead. To give B5G network designers more insight and in-depth technical recommendations regarding hybrid beamforming, we present a hybrid beamforming taxonomy in terms of channel state information (CSI) availability, frequency bandwidth, architecture complexity, analog beamformer components, number of users, connectivity to RF chains, and the digital and analog beamforming design. Furthermore, we provide a comprehensive survey of the state-of-the-art use cases for each classification, followed by identification of the future challenges and open research issues.


I. INTRODUCTION
Fifth-generation (5G) wireless networks have recently gained considerable attention from academia and the telecommunication industry as the cornerstone of future communication networks and smart societies. Recent reports reveal that wireless communication traffic is doubling annually and overtaking wired communication traffic [1]-[3]. Moreover, the demand for wireless broadband and content-rich services has also grown as a result of emerging bandwidth-hungry applications such as cloud gaming, vehicle-to-everything (V2X) communications, industrial automation, remote health services, augmented reality, hologram services, smart city applications, and smart homes [4], [5]. The evolution of the internet of things (IoT) is also fueling the need to support massive connectivity of devices with ultra-reliable and ultra-low-latency communications (URLLC) to enable delay-sensitive and mission-critical services that require very low end-to-end (E2E) delay, such as tactile internet, remote control of medical or industrial robots, and real-time traffic control [6]. It is predicted that roughly 50 billion connected devices will be served by 5G networks by 2020 [6], with an average of 6 devices per individual.
The associate editor coordinating the review of this manuscript and approving it for publication was Irfan Ahmed.
Additionally, besides its ability to support a large number and different types of communicating devices, 5G is expected to provide many services with different traffic features such as different quality of service levels (e.g. latency and data rates), varying mobility levels, different types of data (e.g. Internet protocol (IP) and non-IP data), and multiple traffic models (e.g. burst traffic, high-throughput traffic, delay-sensitive traffic, and non-real-time traffic). The analysis of these specifications implies that meeting the performance targets for spectral efficiency, latency, system capacity, and data rates will require new supporting technologies. As summarized in Table 1, it is shown in the literature [7] that these specifications can be realized through the adoption of three main technologies: network densification to increase area spectral efficiency (i.e. more nodes per unit area per Hz), utilizing the mmWave band to increase the bandwidth (i.e. more Hz), and employing massive MIMO to improve spectral efficiency (i.e. more bits/s/Hz/node). The adoption of these three technologies will also provide a substantial cumulative gain in the area spectral efficiency, measured in bits/s/unit area.
TABLE 1. 5G performance criteria and their corresponding enabling technologies [8].
VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
With this type of massive and dense communications, energy efficiency will need to be improved in future wireless networks. Developing energy-efficient solutions while sustaining the required technical and performance specifications is essential, since almost 70% of the electricity consumed by telecommunication operators is consumed in the radio part [9]. To satisfy these specifications in future generation networks while maintaining energy efficiency, hybrid beamforming (also known as hybrid precoding) for massive MIMO mmWave is regarded as an essential component of 5G and beyond (B5G) wireless networks. A complete list of the abbreviations used throughout the paper is given in Table 2.

II. CORNERSTONE TECHNOLOGIES FOR ENABLING 5G AND BEYOND NETWORKS
The two most important technologies widely accepted as principal candidates to meet the 5G network requirements are mmWave communication and massive MIMO, which are closely related to one another. The following sub-sections explain the concepts of mmWave communication and massive MIMO in detail.

A. MILLIMETER WAVE COMMUNICATIONS
While 5G standards are still in progress, the aims of higher data rates, lower latency, and energy efficiency are clear, and higher data rates bring the need for wider spectrum bands. Spectral efficiency and bandwidth are the most prominent fields to be investigated to meet this increasing demand. Nowadays, most wireless systems operate in the 300 MHz to 5 GHz band [8], but the spectrum available in these bands, and up to 6 GHz, is insufficient to meet the 5G requirements and has become nearly fully utilized. Also, because the technologies used in the physical layer (i.e. the hardware) are already operating close to the Shannon capacity [10], the only alternative is to increase the system bandwidth, in particular by using the mmWave band extending from 3 GHz to 300 GHz. More free spectrum is available in these bands than in the sub-6 GHz bands; therefore, significant efforts are being exerted on utilizing them.
The small wavelengths at these high-frequency bands enable implementations with many more antenna elements per system within very small form factors. However, mmWave bands also have harsh propagation conditions, including heavy path loss, high atmospheric and rain absorption, and poor diffraction around obstacles and penetration through objects [7]. For example, gas absorption can attenuate a 60 GHz signal by 10 dB/km, while a traditional 700 MHz signal undergoes attenuation of only 0.01 dB/km. It is recognized that in specific bands (e.g. 35 GHz, 94 GHz, 140 GHz, and 220 GHz), mmWave propagation undergoes comparatively small attenuation; therefore, a long-haul connection can be realized in these particular mmWave bands. In other bands (e.g. 60 GHz, 120 GHz, and 180 GHz), however, mmWave propagation exhibits severe attenuation of up to 15 dB/km [11]. The mmWave signals also undergo poor diffraction when confronting blockages because of their short wavelengths [12]. These two characteristics reduce the transmission range of mmWave and cause frequent link drops.
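The attenuation figures above can be put in context with a small link-budget calculation. The following is a minimal sketch, assuming the standard Friis free-space path-loss formula and a hypothetical 200 m link; only the 10 dB/km and 0.01 dB/km absorption rates come from the text, and the function names are illustrative.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0

def free_space_path_loss_db(distance_m, freq_hz):
    """Friis free-space path loss in dB between isotropic antennas."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT_M_S)

def total_loss_db(distance_m, freq_hz, absorption_db_per_km):
    """Free-space loss plus a distance-proportional gas-absorption term."""
    absorption_db = absorption_db_per_km * distance_m / 1000.0
    return free_space_path_loss_db(distance_m, freq_hz) + absorption_db

# Assumed link distance (not from the source); absorption rates as quoted in the text.
distance_m = 200.0
loss_60ghz = total_loss_db(distance_m, 60e9, 10.0)
loss_700mhz = total_loss_db(distance_m, 700e6, 0.01)
extra_loss_db = loss_60ghz - loss_700mhz

print(f"60 GHz: {loss_60ghz:.1f} dB, 700 MHz: {loss_700mhz:.1f} dB, gap: {extra_loss_db:.1f} dB")
```

Under these assumptions, most of the gap at short range comes from the frequency-dependent free-space term (roughly 39 dB) rather than from gas absorption (about 2 dB over 200 m), which is why array gain is usually needed to close the mmWave link budget.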
The major difficulties in mmWave systems are spatial management, link margin enhancement, interference control, and blockage. Semiconductor and RF integrated circuit technology for mmWave is maturing [13], its cost and power consumption are gradually declining, and the other propagation impediments are becoming easier to overcome with continuous and focused effort [14]-[17]. Besides, intelligent large array designs and spatial signal-processing techniques, including beamforming, have been widely exploited to extend the coverage of mmWave networks and overcome path loss and undesirable interference sources [16], [18]. For example, the link margin can be improved by performing beamforming in large antenna arrays to generate highly directional beams. The short mmWave wavelength enables the implementation of a large number of antennas in a small form factor.
Although high mobility management is not anticipated in mmWave networks, the support of low-mobility devices is necessary. Using directional antenna arrays in mmWave bands requires accurate beam alignment since the dominant multi-path components (MPCs) are limited. Moreover, link acquisition and establishment have to be quick and adaptive based on device locations. Thus, beam steering capability is indispensable in mmWave networks, and exploiting the array gain to offset the path-loss effect should be considered carefully. The baseband effective channel can also be utilized efficiently by employing directional arrays. For instance, the channel delay spread can be reduced, which alleviates inter-symbol interference and enhances system performance [19]. However, since mmWave systems with large bandwidth require short symbol durations, complex equalization methods are expected [20]. Unlike the current microwave-based architectures, the hybrid beamforming (HBF) architecture turns out to be more suitable for mmWave characteristics [21]. The HBF architectures and algorithms for the sub-6 GHz bands can, in theory, be utilized in the mmWave spectrum. In practice, however, mmWave bands exhibit different propagation characteristics and require specific RF hardware. Therefore, innovative HBF methods that consider those practical differences are necessary to realize 5G systems. Fortunately, the sparse nature of mmWave channels can be exploited to reduce the computational complexity of channel estimation and beam training.
Regardless of the potential advantages offered by the mmWave system, it is a power-limited system as a result of its high blocking and path loss. It is also considered an interference-limited system because of co-channel interference. Therefore, to enable mmWave in 5G systems, it is essential to develop an optimal beamforming design that focuses the beams in the desired direction to reduce the path loss and avoid co-channel interference. Fortunately, mmWave has small wavelengths that allow the implementation of a massive number of beamforming antenna elements in a practical form factor to generate directional beams that can serve the maximum number of mobile devices (MDs). Significant efforts have been focused on different beamforming techniques to improve the energy and spectral efficiencies of massive MIMO systems. The works in [22], [23] exploited massive MIMO deployment in mmWave systems and examined the integration of both technologies in 5G networks. It is shown that mmWave massive MIMO has great potential for 5G networks.

B. MASSIVE MIMO
Massive MIMO increases the data rates and capacities of traditional MIMO systems and has emerged as an enabling technology for 5G networks [7], [24]. The combination of massive MIMO and mmWave technologies, in particular, is regarded as a fundamental component in 5G to improve spectral efficiency and overcome the bandwidth constraints [15], [25]. Multiple antennas can be leveraged at the transmitter and/or the receiver to achieve multiplexing gain, in which parallel data streams are transmitted simultaneously over multiple antennas to increase the data rate; diversity gain, in which redundant data streams are sent using space-time coding to reduce the bit-error rate (BER); or antenna gain, which increases the signal-to-noise ratio (SNR) at the receiver and helps eliminate co-channel interference in multi-user systems.
With proper antenna arrangements, antenna array directional beams can also increase SNR, reduce the root-mean-squared (RMS) delay spread due to multi-path scattering at the receiver, and enhance the Rician factor gain [26].
Massive MIMO can also minimize fading effects and the transmission energy using the beamforming gain [27]. Besides, massive MIMO is necessary for mmWave frequencies because it utilizes beamforming gain to improve the link margin. Consequently, the massive MIMO system facilitates additional access to mmWave bands and helps increase the spectral efficiency [22], [28].
The use of massive MIMO holds the potential for higher array gain. Additionally, increasing the number of antennas at the transmitter and receiver enables higher multiplexing and diversity gains and yields a channel matrix with desirable characteristics [28]. Precoding and combining techniques are utilized to achieve this gain. It was shown that linear precoding and combining techniques such as zero-forcing (ZF) and matched filtering (MF) can provide near-optimal performance by exploiting the favorable propagation characteristics [24]. However, these techniques need a dedicated radio frequency (RF) chain for each antenna. In conventional MIMO, every antenna is attached to a separate digital baseband. For each antenna, this structure requires dedicated filters, a digital-to-analog converter (DAC) or an analog-to-digital converter (ADC), and amplifiers. This series of components that connects an antenna to the baseband is called an RF chain. Thus, using different digital beamforming (DBF) methods at the baseband, the precoding and combining can be performed with full control over the amplitude and phase of the signal at and from each antenna. Unfortunately, massive MIMO also brings some challenges [24]. Considering the massive number of antennas, a large computational load and higher cost are inevitable since RF components for mmWave frequencies are expensive and power hungry [27]. Traditional DBF will lead to higher implementation complexity, especially in mmWave communication systems [18]. Consequently, power consumption and cost are the limiting factors for using DBF in massive MIMO mmWave systems. It is important to develop an efficient and cost-effective architecture design that can achieve the potential gain using massive antenna arrays and a small number of RF chains.
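To make the difference between MF and ZF precoding concrete, the following is a minimal sketch for a hypothetical downlink with two single-antenna users and four BS antennas. It assumes the textbook forms W_MF = H^H and W_ZF = H^H (H H^H)^{-1}; the channel values are arbitrary, power normalization is omitted, and all helper names are illustrative rather than taken from the surveyed works.

```python
def conj_transpose(a):
    """Hermitian (conjugate) transpose of a matrix stored as a list of rows."""
    return [[a[r][c].conjugate() for r in range(len(a))] for c in range(len(a[0]))]

def matmul(a, b):
    """Plain matrix product for lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix (two users only)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matched_filter_precoder(h):
    """MF precoder: maximizes each user's own signal, ignores interference."""
    return conj_transpose(h)

def zero_forcing_precoder(h):
    """ZF precoder: right pseudo-inverse of H, removes inter-user interference."""
    h_herm = conj_transpose(h)
    gram = matmul(h, h_herm)  # K x K Gram matrix H H^H
    return matmul(h_herm, inverse_2x2(gram))

# Arbitrary example channel: rows are users, columns are BS antennas.
H = [[1 + 1j, 0.5 - 0.2j, -0.3 + 0.8j, 0.9j],
     [0.2 - 0.7j, 1.1 + 0.1j, 0.4 + 0.4j, -0.6 + 0.3j]]

effective_mf = matmul(H, matched_filter_precoder(H))  # off-diagonals = interference
effective_zf = matmul(H, zero_forcing_precoder(H))    # approximately the identity

print("MF inter-user leakage:", abs(effective_mf[0][1]))
print("ZF inter-user leakage:", abs(effective_zf[0][1]))
```

Both precoders act on every antenna individually, which is why a fully digital implementation needs one RF chain per antenna.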
Massive MIMO technology also introduces new obstacles such as pilot contamination, which results from inter-cell or intra-cell interference during pilot transmissions from the mobile devices (MDs) to the base station (BS). Precise CSI estimation is necessary for wireless systems. In multi-user massive MIMO (MU-massive MIMO) systems, the CSI is particularly important because it enables the transmission of multiple streams while eliminating the interference between users. To estimate the CSI, training sequences or pilots are used. The performance of massive MIMO systems is affected by the CSI availability at the BS. For the BS to obtain the CSI in a massive MIMO system, a time-division duplex (TDD) mode is usually considered. With this mode, the BS estimates the uplink channel and applies the same channel parameters in the downlink. In the frequency-division duplex (FDD) mode, the estimation poses a big challenge because the channel estimation time increases with the number of transmit antennas and accordingly increases the signaling overhead, since the MDs have to feed back the estimated channel parameters to the BS. Besides, high mobility decreases the channel coherence time. Consequently, estimating the channel in the downlink might not be feasible. In contrast, the TDD mode allows lower-complexity designs as both the uplink and downlink transmissions use the same frequency in different time slots. CSI availability at the BS is necessary to apply beamforming to increase the system energy and spectral efficiency. In this context, [29] provides analysis and classification for the pilot contamination problem and demonstrates the main factors that affect the performance of the massive MIMO system using TDD mode, such as the need for feedback transmission. However, the study did not consider pilot contamination and estimation errors in HBF architectures.
The survey in [30] provides classification and analysis for massive MIMO systems and applications as well as channel measurements and models. However, the use of mmWave technology and the necessity of employing HBF to decrease the system complexity and increase the energy efficiency were not given considerable attention. Motivated by the shortcomings of the study in [30], this work mainly aims to fill the gaps in the previous studies regarding hybrid precoding and gives a detailed exploration of hardware architectures, methods of deployment, frequency bandwidth, and CSI availability.

III. BEAMFORMING IN 5G NETWORKS
Through beamforming, the signals produced by an antenna array are directed to a specific angular direction [31]. Specifically, beamforming transmits redundant symbols over each transmit antenna with a weighting factor. At the receiver, the received signals are coherently combined using separate weighting factors to increase the SNR. The increase in SNR in large antenna array systems is known as the beamforming gain, while the diversity gain is the change in the slope of the error probability resulting from the beamforming gain [32]. In massive MIMO systems, beamforming leverages smart antennas to transmit and receive the signals. Smart antennas are arrays that employ signal processing algorithms to detect spatial signal identifiers, e.g. the direction of arrival (DoA), and use these identifiers to estimate the beamforming vectors, which are used to recognize and track the desired signal transmitted from the MDs.
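The coherent-combining idea can be illustrated with a uniform linear array. The following is a minimal sketch, assuming half-wavelength element spacing, a single plane wave, and weights matched to the steering vector; the array size and angles are arbitrary and the function names are illustrative.

```python
import cmath
import math

def steering_vector(num_antennas, angle_rad, spacing_wavelengths=0.5):
    """Array response of a uniform linear array to a plane wave from angle_rad."""
    phase_step = 2.0 * math.pi * spacing_wavelengths * math.sin(angle_rad)
    return [cmath.exp(1j * phase_step * n) for n in range(num_antennas)]

def matched_weights(num_antennas, steer_angle_rad):
    """Unit-norm weights that co-phase the array toward steer_angle_rad."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [scale * a for a in steering_vector(num_antennas, steer_angle_rad)]

def beamforming_gain(weights, angle_rad):
    """Power gain |w^H a(angle)|^2 of the weighted array in a given direction."""
    response = steering_vector(len(weights), angle_rad)
    combined = sum(w.conjugate() * a for w, a in zip(weights, response))
    return abs(combined) ** 2

num_antennas = 16
weights = matched_weights(num_antennas, math.radians(30.0))
gain_on_target = beamforming_gain(weights, math.radians(30.0))    # equals num_antennas
gain_off_target = beamforming_gain(weights, math.radians(-20.0))  # far below the peak

print(f"gain toward 30 deg: {gain_on_target:.2f}, toward -20 deg: {gain_off_target:.3f}")
```

With unit-norm weights the gain in the steered direction equals the number of antennas, which is the beamforming gain referred to above.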
It is preferable to have independent weighting control over each element in an active antenna array to realize beamforming control and flexibility. This, however, requires a dedicated RF chain for each element. Such a requirement in a massive MIMO system is limited by power, cost, and space. In beamforming, concentrating the radio signal into a narrow beam helps to overcome the reduced propagation range associated with the very high-frequency carriers in mmWave. However, beamforming antennas at BSs must follow the mobile equipment for the device to remain within the beam, and the cost would likely increase significantly if many more antennas are added to support large numbers of users per cell. Besides, both horizontal and vertical orientations need to be taken into account when designing the beams.
There are many efforts in the literature to classify beamforming methods based on different aspects. Authors in [33] classified beamforming methods based on their physical characteristics into two classes, namely, switched and adaptive beamforming. They also classified the methods based on the type of antenna array into linear arrays, circular arrays, and rectangular arrays. Recently, authors in [34], [35] divided the beamforming methods into analog beamforming (ABF), DBF, and HBF. ABF has the advantage of using low-cost phase shifters for massive MIMO, while DBF has the advantage of acquiring user signals more precisely and quickly, but it also adds more complexity and cost to the design. Therefore, DBF is not suitable for massive MIMO systems. In contrast, HBF is developed to combine the benefits of ABF and DBF for massive MIMO. Another factor that could be used to classify beamforming methods is the signal bandwidth (i.e. wideband or narrowband). Using wideband mmWave instead of the standard narrowband can enhance the beamforming performance. Besides, because of the small wavelength and sharp beamwidth, the antenna array can be extremely small, allowing more antenna elements to be implemented in small form factors while limiting the distance between the BS and MD to a few hundred meters.

A. NARROWBAND VERSUS WIDEBAND BEAMFORMING
Based on the signal bandwidth, beamforming can be classified into two classes, narrowband and wideband beamforming techniques. Beamforming with narrowband signals can be performed by an instantaneous linear combination of the received array signals. On the other hand, wideband signals require additional processing (e.g. tapped delay lines and sensor delay lines). Current standard wireless technologies are primarily focused on narrowband beamforming; however, wideband beamforming is also growing as an essential part of the 5G network due to the adoption of mmWave, which can enhance the beamforming capabilities. Besides, because of the small wavelength and the sharp beamwidth, the antenna array can be extremely small, allowing more antenna elements to be implemented in small form factors. Therefore, mmWave wideband beamforming in 5G can offer remarkably high speeds that approach the maximum achievable capacities.
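The distinction between the two classes can be written compactly. The following is a sketch in standard array-processing notation, which is an assumption here since the text defines no symbols: x_m[n] is the sampled signal at antenna m, M is the number of antennas, K is the number of taps per antenna, and w denotes the complex weights.

```latex
% Narrowband: one complex weight per antenna (instantaneous linear combination)
y[n] = \sum_{m=0}^{M-1} w_m^{*} \, x_m[n]

% Wideband: a K-tap delay line behind each antenna (tapped-delay-line beamformer)
y[n] = \sum_{m=0}^{M-1} \sum_{k=0}^{K-1} w_{m,k}^{*} \, x_m[n-k]
```

The wideband form reduces to the narrowband one for K = 1; the extra taps give each antenna branch a frequency-dependent response at the cost of M K weights instead of M.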

B. SWITCHED VERSUS ADAPTIVE ARRAY BEAMFORMING
Beamforming can also be classified into switched or adaptive array systems. The switched method employs a fixed beamforming network that generates predefined beams. The Butler matrix, which consists of phase shifters, crossovers, and hybrid couplers, is a well-known solution for switched beamforming [36]. To determine the proper beam to acquire the desired signal from a specific MD, the switched system needs a switching network. However, the predefined beams might not always point in the desired direction. Besides, several MDs are usually served by the same beam. These issues have been addressed in [37]-[39].
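The behavior of a switched system can be sketched with a fixed codebook of orthogonal beams. The example below assumes a DFT codebook as a stand-in for the beams of an N x N Butler matrix and a single-path channel described by a hypothetical spatial frequency in cycles per element; the names and values are illustrative.

```python
import cmath
import math

def dft_codebook(num_antennas):
    """N fixed, mutually orthogonal unit-norm beams (columns of a DFT matrix)."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [[scale * cmath.exp(2j * math.pi * k * m / num_antennas)
             for m in range(num_antennas)]
            for k in range(num_antennas)]

def single_path_channel(num_antennas, spatial_freq):
    """Line-of-sight channel with a given spatial frequency (cycles per element)."""
    return [cmath.exp(2j * math.pi * spatial_freq * m) for m in range(num_antennas)]

def select_beam(codebook, channel):
    """Switching step: pick the predefined beam with the largest gain |w^H h|^2."""
    gains = [abs(sum(w.conjugate() * h for w, h in zip(beam, channel))) ** 2
             for beam in codebook]
    best_index = max(range(len(gains)), key=gains.__getitem__)
    return best_index, gains[best_index]

num_antennas = 8
codebook = dft_codebook(num_antennas)

# A device exactly on the grid of beam 3 receives the full array gain.
on_grid_index, on_grid_gain = select_beam(codebook, single_path_channel(num_antennas, 3.0 / 8.0))

# A device halfway between beams 3 and 4 loses part of the gain.
off_grid_index, off_grid_gain = select_beam(codebook, single_path_channel(num_antennas, 3.5 / 8.0))

print(f"on-grid: beam {on_grid_index}, gain {on_grid_gain:.2f}")
print(f"off-grid: beam {off_grid_index}, gain {off_grid_gain:.2f}")
```

The off-grid case illustrates the limitation noted above: the best predefined beam does not point exactly at the device, so part of the array gain is lost.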
Adaptive array systems, in contrast with switched systems, can form a unique beam for each MD by applying weight vectors to the detected signals through adaptive array processors to control the phase differences between the antenna array elements and their amplitude distribution. This method can form a precise beam, directing the main lobe towards the preferred MD and placing a null toward the interfering MD. However, the adaptive system requires that the BS update the MD locations from the estimated DoA of the received signals, while in practice, estimating the DoA of a large number of MDs can be a challenging task. In adaptive array systems, beamforming algorithms are classified into two types: non-blind and blind adaptive algorithms. Non-blind algorithms require prior information about the transmitted signal, using training signals to determine the beam direction. Blind algorithms, on the other hand, require no prior knowledge. In practice, it is easier to implement a switched beamforming system than an adaptive beamforming system. However, adaptive beamforming can decrease the interference between different MDs and achieve better energy efficiency [40], [41]. In general, both methods have their advantages and disadvantages [42]; although adaptive beamforming is hard to implement, most massive MIMO studies favor this design over switched beamforming due to its reliability and applicability.

C. HARD AND SOFT ANTENNA SELECTION
To reduce the number of RF chains in massive MIMO systems, hard and soft antenna selection methods are proposed in [43]. In the hard selection approach, a network of switches is used to connect the RF chains to the antennas and, based on the design objective (e.g. spectral efficiency maximization), the best antenna set is chosen. An exhaustive search over the different combinations of selected antennas is employed to reach the maximum performance, which introduces high computational complexity. To circumvent this issue, sub-optimal methods that maximize the spectral efficiency based on convex optimization are explored in [44]-[47]. When the number of antennas is significantly larger than the number of RF chains, considerable beamforming gains cannot be realized because of the array gain loss, which prohibits the application of hard antenna selection approaches. In contrast, in soft antenna selection the RF chains are connected to the antennas through a network of phase shifters [43], [48], [49], which offers better flexibility.
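The cost of exhaustive hard selection can be seen in a small example. The following is a minimal sketch, assuming a single-stream channel with one coefficient per antenna and the spectral efficiency log2(1 + SNR * sum |h_i|^2) over the selected antennas. The low-complexity alternative shown is a simple norm-based ranking, used here only for illustration and not the convex-optimization methods of [44]-[47]; all names and values are hypothetical.

```python
import itertools
import math

def spectral_efficiency(channel, subset, snr):
    """Single-stream spectral efficiency when only `subset` antennas are active."""
    selected_power = sum(abs(channel[i]) ** 2 for i in subset)
    return math.log2(1.0 + snr * selected_power)

def exhaustive_selection(channel, num_rf_chains, snr):
    """Evaluate every antenna subset of size num_rf_chains and keep the best one."""
    candidates = itertools.combinations(range(len(channel)), num_rf_chains)
    return max(candidates, key=lambda s: spectral_efficiency(channel, s, snr))

def norm_based_selection(channel, num_rf_chains):
    """Low-complexity alternative: keep the antennas with the strongest channels."""
    ranked = sorted(range(len(channel)), key=lambda i: abs(channel[i]) ** 2, reverse=True)
    return tuple(sorted(ranked[:num_rf_chains]))

# Arbitrary example channel with 8 antennas and 3 RF chains.
channel = [0.3 + 0.1j, -1.2 + 0.4j, 0.05j, 0.9 - 0.9j, 0.2 + 0.2j, -0.7j, 1.0 + 0.0j, 0.1 - 0.3j]
num_rf_chains = 3

best_subset = exhaustive_selection(channel, num_rf_chains, snr=10.0)
fast_subset = norm_based_selection(channel, num_rf_chains)
num_candidates = math.comb(len(channel), num_rf_chains)

print(f"exhaustive: {best_subset} after {num_candidates} evaluations; norm-based: {fast_subset}")
```

For this simple metric the two methods agree, but the number of candidate subsets grows combinatorially with the array size, which is what makes exhaustive search impractical for massive arrays.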

D. ANALOG, DIGITAL, AND HYBRID BEAMFORMING
Another classification of beamforming techniques is into analog, digital, and hybrid beamforming. ABF is the oldest form of spatial filtering and was introduced about 50 years ago. Analog beamforming controls the transmitted signal phase through low-cost phase shifters while employing RF switches to steer the beams. Analog or RF beamforming, as shown in Fig. 1, is typically implemented using phase shifters, where all the antenna elements share a single RF chain and the beamforming weights are constrained by the constant amplitude of the phase shifters. ABF has been extensively studied in the literature and explored in mmWave MIMO systems in particular [34], [50]-[53]. However, most of the approaches introduced in the literature provide insufficient antenna gain and moderate performance. In particular, these efforts have not exploited the structure of the fully connected mmWave massive MIMO channel. ABF offers lower complexity than DBF, but its performance is still lower due to the lack of amplitude control. A solution to this problem has been proposed in [45], [51] by utilizing simple analog switches to perform antenna subset selection. However, this design can only provide limited array gain and lower performance in correlated channels.
Based on the structure of realistic mmWave channels, a low-complexity design is presented in [54]. A clustered channel model is formulated to exploit the limited scattering at high frequency and the antenna correlation for single-user precoding in practical transceiver architectures. ABF has been used for short-range mmWave systems such as those in the IEEE 802.15 standard [55]. However, the phase shifters used in ABF impose a constant-modulus constraint on the system. For each beamforming criterion, the beamforming design problem is formulated as an optimization problem. Solving such an optimization problem may be a challenging task. The solution becomes more challenging when the practical constraints of the phase shifters are considered, which adds a large computational load to the system depending on the quantization resolution.
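The effect of the constant-modulus and quantization constraints can be illustrated numerically. The following is a minimal sketch, assuming b-bit phase shifters with uniformly spaced phases, a single-path channel for a half-wavelength uniform linear array, and a hypothetical 32-element array steered to 20 degrees; the function names are illustrative.

```python
import cmath
import math

def quantize_phase(phase_rad, bits):
    """Round a phase to the nearest of 2**bits uniformly spaced phase-shifter states."""
    levels = 2 ** bits
    step = 2.0 * math.pi / levels
    return (round(phase_rad / step) % levels) * step

def quantized_analog_beamformer(channel, bits):
    """Constant-modulus weights whose phases track the channel up to quantization."""
    scale = 1.0 / math.sqrt(len(channel))
    return [scale * cmath.exp(1j * quantize_phase(cmath.phase(h), bits)) for h in channel]

def array_gain(weights, channel):
    """Power gain |w^H h|^2 of the analog beamformer on the given channel."""
    return abs(sum(w.conjugate() * h for w, h in zip(weights, channel))) ** 2

num_antennas = 32
angle_rad = math.radians(20.0)
channel = [cmath.exp(1j * math.pi * n * math.sin(angle_rad)) for n in range(num_antennas)]

gain_by_bits = {bits: array_gain(quantized_analog_beamformer(channel, bits), channel)
                for bits in (1, 2, 3, 6)}

for bits, gain in gain_by_bits.items():
    print(f"{bits}-bit phase shifters: gain {gain:.2f} of at most {num_antennas}")
```

Because every weight has the same magnitude, only the phases can be tuned, and coarse quantization leaves a residual phase error on each element that lowers the achievable array gain.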
In contrast to ABF, DBF employs a digital signal processor to perform beamforming, providing a higher degree of freedom (DoF) that allows more flexibility in implementing effective beamforming algorithms. As shown in Fig. 2, each antenna element in DBF systems requires a separate RF chain, which enables additional operations such as DoA estimation, adaptive steering of beams and nulls to improve the signal-to-interference-plus-noise ratio (SINR), and programmable control of antenna radiation patterns. In general, to achieve optimal MIMO channel capacity, the availability of full CSI at the transceiver is crucial and digital processing at both ends is required. This, however, demands a dedicated RF chain for each antenna to deliver higher DoF. Using a large number of RF chains produces a complex architecture with high power consumption, largely due to its mixed-signal circuits. Moreover, due to the large number of antenna elements in massive MIMO, DBF can also be expensive to realize. Hence, the fully digital solution is not viable for massive MIMO systems at mmWave frequencies.
In conclusion, by using low-cost phase shifters, ABF is easier to implement and more cost-effective than DBF but achieves lower performance due to the lack of amplitude control in the phase shifters. As shown in Fig. 4, fully digital beamforming outperforms ABF and achieves higher spectral efficiency. To realize an optimal trade-off between the cost of ABF and the performance of DBF, the HBF architecture has been proposed [56]-[58]. HBF is broadly employed in massive MIMO systems, where baseband signals are generated in the digital stage while the analog stage reduces the number of RF chains, and hence the number of ADCs and DACs, which in turn enhances power amplifier performance or modifies the structure of the mixers to reduce costs. In this structure, as shown in Fig. 3, a network of phase shifters connects a small number of RF chains to a large number of antennas.
It is envisioned that HBF will achieve the gains of both ABF and DBF. In this context, the HBF architecture has appeared as a promising solution for the next generation of mmWave massive MIMO systems [59]. Accordingly, the HBF architecture is of critical value when implementing massive MIMO systems because of its energy and cost efficiency [20]. In mmWave massive MIMO, the analog and digital signal components required for each antenna make it impractical to perform all the signal processing tasks with a fully digital architecture because of the high cost and energy consumption [28], [48]. This motivates the development of different HBF architectures.

IV. HYBRID BEAMFORMING
Hybrid ABF and DBF was first suggested in [43] under the term "soft antenna selection". The hybrid architecture utilizes the digital precoding and analog beamforming methods to balance the cost and performance of both approaches. The HBF can be seen as a spatial filter that has the ability to strengthen the desired signal components as well as reduce the impact of undesired signal components. The huge interest in HBF is driven by the fact that the number of RF chains is only lower-limited by the number of transmitted data streams, while the diversity and beamforming gains are limited by the number of antenna elements [32].
Massive MIMO arrays offer a large number of DoFs, which improves the wireless system performance by reducing the channel fading effect. In DBF, every antenna element has to be connected to at least one RF chain, which in turn leads to huge complexity and cost in mmWave massive MIMO systems [18]. In contrast, HBF employs analog phase shifters with fewer RF chains, which leads to a low-complexity and cost-effective system with virtually the same performance [35], [60]-[69]. With HBF, the beamforming process is implemented in both the analog and digital domains. In the first part, the transmit signal is processed by a digital precoder; because this stage operates at baseband on a small number of streams, the signal dimension is low and a low-dimensional precoder can be used. In the second part, traditional analog beamforming, which is typically built from RF phase shifters, is employed to direct the digital output to the antenna arrays with low complexity. At the receiver, the same procedures are performed in reverse order. In general, HBF strikes an efficient tradeoff between spectral efficiency, energy efficiency, and hardware complexity to take advantage of both ABF and DBF [48], [70]. The main benefits of using HBF architectures can be summarized as follows:
• HBF enables practical and efficient use of massive MIMO mmWave systems with great capacity improvements. Unlike HBF, fully digital beamforming is more complex and expensive, and analog beamforming suffers from imprecision and interference problems [49].
• HBF minimizes the hardware cost since it employs fewer RF chains at the transceivers compared to fully digital architecture with the same number of antennas [57].
• HBF is more energy-efficient than legacy MIMO systems operating in traditional microwave bands. HBF allows the realization of the large antenna arrays in mmWave massive MIMO which minimizes the consumed power in the transceivers. In the uplink, HBF can also achieve more energy efficiency because it reduces the power consumption for each MD without affecting the performance [71], [72].
• HBF designs leverage additional DoFs compared to analog beamforming by adding the extra digital precoding stage to support multi-stream and multi-user transmission.
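The two-stage structure described above can be sketched as a matrix product x = F_RF F_BB s. The example below is a minimal illustration under common assumptions that are not stated in the text: the analog precoder F_RF has constant-modulus entries built from steering vectors, the digital precoder F_BB is arbitrary, and the total power is normalized so that the squared Frobenius norm of F_RF F_BB equals the number of streams. All dimensions, angles, and names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Plain matrix product for lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def analog_precoder(num_antennas, steer_angles_rad):
    """N_t x N_RF matrix of phase-only entries, one steering vector per RF chain."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [[scale * cmath.exp(1j * math.pi * n * math.sin(angle))
             for angle in steer_angles_rad]
            for n in range(num_antennas)]

def normalize_digital_precoder(f_rf, f_bb, num_streams):
    """Scale F_BB so that the total power ||F_RF F_BB||_F^2 equals num_streams."""
    product = matmul(f_rf, f_bb)
    power = sum(abs(x) ** 2 for row in product for x in row)
    scale = math.sqrt(num_streams / power)
    return [[scale * x for x in row] for row in f_bb]

def hybrid_precode(f_rf, f_bb, symbols):
    """Digital stage first (N_s -> N_RF), then analog stage (N_RF -> N_t)."""
    baseband = matmul(f_bb, [[s] for s in symbols])
    return [row[0] for row in matmul(f_rf, baseband)]

NUM_ANTENNAS, NUM_RF_CHAINS, NUM_STREAMS = 16, 4, 2
F_RF = analog_precoder(NUM_ANTENNAS, [math.radians(a) for a in (-40.0, -10.0, 15.0, 45.0)])
F_BB = normalize_digital_precoder(
    F_RF, [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5j], [0.2j, -0.3]], NUM_STREAMS)

symbols = [(1 + 1j) / math.sqrt(2), (1 - 1j) / math.sqrt(2)]
x_tx = hybrid_precode(F_RF, F_BB, symbols)
print(f"{NUM_STREAMS} streams -> {NUM_RF_CHAINS} RF chains -> {len(x_tx)} antenna signals")
```

Only the small N_RF x N_s digital matrix needs full amplitude and phase control, while the large N_t x N_RF analog matrix is restricted to phase shifts, which reflects the cost and DoF tradeoff listed above.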

V. HYBRID BEAMFORMING CLASSIFICATION
There are only a few studies that describe the hardware features and classification of HBF architectures. Authors in [25] classified HBF architectures at the BS side based on the CSI availability (i.e. full or estimated), the frequency bandwidth (i.e. wideband or narrowband), and the architecture complexity (i.e. full complexity, reduced complexity, and switched). Authors in [28] classified HBF architectures based on the components used to build the ABF stage into switches-based HBF (S-HBF), phase-shifters-based HBF (PS-HBF), and lens-antenna-arrays-based HBF (LNA-HBF). In the S-HBF architecture, the sparsity of the mmWave channel is exploited, where only a subset of antennas is chosen instead of optimizing all the quantized phase values. The PS-HBF experiences quantization errors because of the finite step of the phase shifters and consumes more power; however, it can also reduce the residual interference between data streams. In the LNA-HBF architecture, an LNA is used in the ABF stage instead of the switches and phase shifters, and the continuous aperture of the array directs the transmitted beams. The study in [28] concentrated on the analog stage of the HBF without addressing the DBF stage. As highlighted above, the HBF design problem can be considered based on several criteria in different situations. For instance, spectral efficiency maximization can be examined in single-user (SU) or multi-user (MU) scenarios, in narrowband or wideband channels, with full or estimated CSI, and by using a joint or separate optimization design. In the following sections, a comprehensive classification of HBF and a survey of the existing works in each class are presented.

A. FULLY-CONNECTED AND PARTIALLY-CONNECTED HYBRID BEAMFORMING ARCHITECTURES
Based on the connectivity of the phase shifter network, the PS-HBF architecture is commonly divided into two configurations: fully-connected and partially-connected. Fig. 5 shows the fully-connected architecture, in which all RF chains are connected to all the antennas. In this architecture, a large number of phase shifters is employed to map every RF chain to every antenna element. In the partially-connected architecture shown in Fig. 6, every RF chain is connected to a subset of antennas and each antenna of this subset is attached to a phase shifter. This architecture has several advantages over the fully-connected architecture, such as lower complexity and ease of implementation. Furthermore, the hardware cost is also decreased as fewer RF chains are needed. On the other hand, since each RF chain is attached to fewer antenna elements, several undesired issues arise, such as lower spectral efficiency, weaker directivity, wider beamwidth, and interference from other chains. Despite these limitations, MIMO systems help to reduce the interference in partially-connected architectures. Furthermore, the remarkably low complexity, and accordingly low power consumption, of partially-connected architectures makes them more practical for implementation in mobile stations in the uplink. The fully-connected architecture offers better spectral efficiency and higher beamforming gain, but it also requires more power and is harder to realize because of the higher number of RF chains and their interconnections [73], [74]. Therefore, the partially-connected architecture is more practical and cost-effective; however, the fully-connected architecture is still regularly considered in academic works.
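The difference between the two connectivity patterns can be illustrated with a minimal sketch of the analog precoder structure. The function names, the equal-size sub-array assumption, and the example dimensions (64 antennas, 4 RF chains) are illustrative choices and are not taken from the cited works.

```python
import numpy as np

def analog_precoder_mask(n_antennas, n_rf, fully_connected):
    """Boolean mask: entry (i, j) is True if antenna i is wired to RF chain j."""
    if fully_connected:
        return np.ones((n_antennas, n_rf), dtype=bool)
    if n_antennas % n_rf != 0:
        raise ValueError("this sketch assumes equal-size sub-arrays")
    sub = n_antennas // n_rf
    mask = np.zeros((n_antennas, n_rf), dtype=bool)
    for j in range(n_rf):
        # RF chain j drives one contiguous sub-array only
        mask[j * sub:(j + 1) * sub, j] = True
    return mask

def random_analog_precoder(mask, rng):
    """Unit-modulus (phase-only) entries where connected, zero elsewhere."""
    phases = rng.uniform(0.0, 2.0 * np.pi, size=mask.shape)
    return np.where(mask, np.exp(1j * phases), 0.0)

full = analog_precoder_mask(64, 4, fully_connected=True)
part = analog_precoder_mask(64, 4, fully_connected=False)
# Each True entry corresponds to one phase shifter in the analog network.
print("phase shifters (fully, partially):", int(full.sum()), int(part.sum()))
```

Under these assumptions the fully-connected network needs one phase shifter per antenna and RF chain pair, whereas the partially-connected network needs only one per antenna, which is the source of its lower hardware complexity.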
The purpose of all HBF architectures is to minimize the signal processing and hardware complexity while achieving near-optimal performance. As discussed in the previous section, the fewer RF chains in partially-connected architectures provide better energy efficiency at the cost of a noticeable decrease in spectral efficiency. Therefore, the combination of such an architecture with massive MIMO, where spectral and energy efficiency are maximized, is anticipated to deliver significant energy and spectral efficiency improvements for 5G and beyond networks.
The dynamic partially-connected architecture is shown in Fig. 7. Its main goal is to dynamically adapt to the average channel statistics by using switches and phase shifters, and then to employ a low-complexity greedy algorithm instead of an exhaustive antenna search [75]. Its spectral efficiency is higher than that of the partially-connected architecture but still lower than that of the fully-connected one, and it also consumes less power than ABF. However, the use of switches at mmWave frequencies causes high insertion losses [76]. The virtual sectorization architecture in Fig. 8 is a special type of fully-connected architecture in which a separate digital beamformer is connected to every virtual sector. This architecture has the same spectral efficiency as that achieved with the fully-connected architecture.
FIGURE 8. Fully-connected virtual sectorization, a special type of fully-connected architecture in which a separate digital beamformer is connected to every virtual sector, where N_{S,i} is the number of data streams for the i-th virtual sector with i = {1, ..., N}, N^{BS}_{RF,j} is the number of BS RF chains at the j-th sector with j = {1, ..., M}, and N_{BS} is the number of BS antennas.

B. HYBRID BEAMFORMING IN SINGLE-USER AND MULTI-USER SCENARIOS
1) SINGLE-USER SCENARIO
The sparse nature of mmWave channels offers a great advantage over conventional sub-6 GHz channels, which lies in the fewer available spatial DoFs. The sparsity can be exploited to simplify both the channel estimation and beam training procedures. In SU-MIMO systems, the simplest form of HBF exploits the channel sparsity and concentrates the array gain on a limited number of multipaths in the ABF stage, while multiplexing data streams and allocating powers in the DBF stage. However, it turns out that such a hybrid architecture is only asymptotically optimal in conventional MIMO systems [77]. For massive MIMO with a large array size, optimal HBF architectures and algorithms are still not fully understood. Also, reductions in hardware and computational complexity are widely investigated due to the unique features of mmWave massive MIMO systems. Several HBF methods have been proposed for mmWave SU-MIMO channels. Among the proposed approaches are codebook-based beamforming, spatially sparse precoding, and antenna and beam selection methods.
• Codebook-based beamforming: Instead of estimating the large channel matrix directly at the receiver, the codebook-based beamforming approach employs a pre-defined set of beams to perform downlink training and only feeds back the selected beam index to the transmitter. With large antenna array systems, in particular, with a fully-connected hybrid architecture, the beam search over a large space can be complex and the feedback process imposes a large overhead. The mmWave sparsity can be exploited in the design of the codebook to reduce the beam search complexity and the feedback overhead. Each codeword is created based on the orthogonal matching pursuit (OMP) algorithm to minimize the mean square error (MSE) with a pre-defined number of RF chains equal to the number of the antenna beam patterns.
• Spatially Sparse Precoding: This technique can attain the same performance level as fully DBF. Because mmWave channels are sparse, the number of dominant multipath components is small, and the optimal precoder can be achieved by using a finite number of antenna elements [48]. The multipath sparsity restricts the ABF to a set of array response vectors, and the baseband precoder optimization can be formulated as a matrix reconstruction problem with a cardinality constraint on the number of RF chains. A near-optimal analog combiner can then be determined using sparse approximation techniques (e.g., OMP) [48].
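The sparse approximation idea behind spatially sparse precoding can be sketched as follows. This is a minimal illustration and not the exact algorithm of [48]: the half-wavelength ULA geometry, the use of the true path departure angles as the dictionary, and all names and dimensions are assumptions made for the example.

```python
import numpy as np

def ula_response(n_antennas, angle):
    """Array response of a half-wavelength ULA (assumed geometry)."""
    n = np.arange(n_antennas)
    return np.exp(1j * np.pi * n * np.sin(angle)) / np.sqrt(n_antennas)

def omp_hybrid_precoder(f_opt, dictionary, n_rf):
    """Greedily pick n_rf dictionary columns as F_RF, then fit F_BB by least squares."""
    n_s = f_opt.shape[1]
    chosen = []
    residual = f_opt.copy()
    for _ in range(n_rf):
        corr = dictionary.conj().T @ residual
        scores = np.sum(np.abs(corr) ** 2, axis=1)
        if chosen:
            scores[chosen] = -np.inf  # never pick the same column twice
        chosen.append(int(np.argmax(scores)))
        f_rf = dictionary[:, chosen]
        f_bb, *_ = np.linalg.lstsq(f_rf, f_opt, rcond=None)
        residual = f_opt - f_rf @ f_bb
        residual = residual / max(np.linalg.norm(residual), 1e-12)
    # Enforce the total power constraint ||F_RF F_BB||_F^2 = N_s.
    f_bb = f_bb * np.sqrt(n_s) / np.linalg.norm(f_rf @ f_bb)
    return f_rf, f_bb

rng = np.random.default_rng(0)
n_t, n_r, n_paths, n_s, n_rf = 32, 8, 6, 2, 4
aod = rng.uniform(-np.pi / 2, np.pi / 2, n_paths)
aoa = rng.uniform(-np.pi / 2, np.pi / 2, n_paths)
gains = (rng.standard_normal(n_paths) + 1j * rng.standard_normal(n_paths)) / np.sqrt(2)
a_t = np.stack([ula_response(n_t, a) for a in aod], axis=1)
a_r = np.stack([ula_response(n_r, a) for a in aoa], axis=1)
h = a_r @ np.diag(gains) @ a_t.conj().T          # sparse geometric channel
_, _, vh = np.linalg.svd(h)
f_opt = vh.conj().T[:, :n_s]                     # unconstrained (fully digital) precoder
f_rf, f_bb = omp_hybrid_precoder(f_opt, a_t, n_rf)
```

Because every column of the analog precoder is an array response vector, its entries have constant modulus by construction, which is what makes the selection realizable with phase shifters.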

2) MULTI-USER SCENARIOS
Recently, HBF has been heavily considered in mmWave MU-MIMO systems. The hybrid architecture at the BS allows it to multiplex and transmit data streams to multiple MDs, each equipped with a single antenna or an array of antennas. When several MDs, each with a single RF chain and many antennas, are considered, the strongest beam pair is selected, which in turn helps the ABF and the ZF digital precoding to alleviate the inter-user interference. In this scenario, the hybrid architecture significantly outperforms the analog beam steering approach. HBF based on beam selection and beamspace MIMO (B-MIMO) approaches can also be applied to MU-MIMO systems with linear baseband digital precoders. Despite its effectiveness and advantages compared to fully-digital precoders in mmWave channels, the use of HBF with mmWave MU-MIMO systems raises many new challenges that need further research efforts. Among such challenges are multi-user scheduling and the 2D and 3D designs of lens antenna arrays. Another important aspect that should be explored is the hardware imperfections of RF transceivers, which degrade the spectral efficiency in many ways. For example, it is harder to precisely create the desired transmit signals when high beamforming gain is required. This is because the non-linear distortion at the receiver depends on the instantaneous channel gain, which in turn affects the SNR. Transceiver imperfections are more pronounced at mmWave frequencies, and consequently the spectral efficiency and SNR of hybrid precoders and combiners do not grow linearly with the number of RF chains. Knowledge of the statistical characteristics of transceiver imperfections at mmWave frequencies is required to fully understand the spectral efficiency scalability in large MIMO systems. Furthermore, with SU transmission, the channel capacity can be achieved with a transmission design based on singular value decomposition (SVD) and water-filling. However, in MU transmission, the SVD calculation at both sides becomes more complicated due to the lack of collaboration between MDs. Usually, dirty paper coding is employed to achieve the channel capacity, but its complexity hinders its realization in practice. Accordingly, low-complexity sub-optimal linear algorithms such as ZF are still in use, and their ability to achieve the optimal sum-capacity for massive MIMO has also been demonstrated [24].
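As an illustration of the low-complexity linear alternative mentioned above, the following sketch computes a ZF digital precoder. It assumes an effective baseband channel of size K x N_RF (one row per single-RF-chain user, as seen after analog beamforming) with K no larger than N_RF; the function name, the random channel, and the unit power normalization are assumptions of the example.

```python
import numpy as np

def zf_precoder(h_eff, total_power=1.0):
    """Zero-forcing precoder W = H^H (H H^H)^{-1}, scaled to the power budget."""
    w = h_eff.conj().T @ np.linalg.inv(h_eff @ h_eff.conj().T)
    return w * np.sqrt(total_power) / np.linalg.norm(w)

rng = np.random.default_rng(1)
k_users, n_rf = 4, 8
h_eff = (rng.standard_normal((k_users, n_rf))
         + 1j * rng.standard_normal((k_users, n_rf))) / np.sqrt(2)
w = zf_precoder(h_eff)
# Row k of `received` holds what user k sees from every stream;
# off-diagonal entries are the inter-user interference removed by ZF.
received = h_eff @ w
interference = received - np.diag(np.diag(received))
print("max residual interference:", float(np.max(np.abs(interference))))
```

The common scaling keeps the product of the effective channel and the precoder diagonal, so each user receives only its own stream up to a shared gain factor.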

C. HYBRID BEAMFORMING WITH PERFECT AND ESTIMATED CSI
The performance of beamforming algorithms relies mainly on the availability and quality of CSI at the transceivers. In practice, the resources assigned to channel estimation are limited in both time division duplex (TDD) and frequency division duplex (FDD) modes, which in turn degrades the estimation accuracy and affects the beamforming performance. Moreover, channel estimation becomes considerably more difficult in massive MIMO systems, and many research efforts have been devoted to minimizing the overhead and complexity of this process. In FDD mode, a series of training pilots is broadcast from the transmitter to the receiver, which estimates the channel and then feeds the channel parameters back to the transmitter. In this situation, the estimation error depends mainly on the noise and the limited number of feedback bits. In massive MIMO systems with a large number of antennas, the estimation and signaling time may exceed the channel coherence time, which in turn leads to the generation of interference. In TDD mode, it is necessary to calibrate the uplink and downlink circuits. Also, because the number of orthogonal pilots is related to the number of users, users sharing the same pilots in different cells can interfere with each other. In HBF architectures, where the number of RF chains is significantly smaller than the number of antennas, the aforementioned problems might become more challenging. Specifically, reducing the channel estimation signaling, which limits the achievable spectral efficiency, is a key design objective in HBF. Another challenge may arise from rapid channel variations and the accompanying estimation delay, which in turn can affect the performance, particularly with high mobility of the MDs in outdoor conditions. Additional difficulties in the design of channel estimation algorithms can also appear in practice. For instance, beamformer designs that are based on the antenna array geometry cannot be realized in practice because they neglect the effect of the user's fingers on the array gain [78]. Based on the above discussion, we have classified the works that concentrate on the effects of channel estimation on the HBF as follows:
• HBF Based on Perfect CSI: Finding the ABF and DBF matrices, which are the two parts of the HBF, is a challenging task even when full CSI is available at the transmitter [48]. The difficulty stems from several facts: (1) the hybrid beamformers and combiners at each end are coupled, which results in a non-convex optimization problem; (2) using phase shifters to design the analog precoder and combiner imposes additional constraints; and (3) when finite-resolution phase shifters are used, the optimal analog beamformers must be selected from a discrete finite set, which results in a non-deterministic polynomial-time hard (NP-hard) problem.
• HBF Based on Estimated CSI: The added overhead to estimate the CSI is a significant challenge in HBF designs, and it depends on whether the system employs TDD or FDD. Massive MIMO downlinks with a digital structure allow a spatial multiplexing gain (SMG) that is determined by N_{BS}, K, and N_{s}, which stand for the number of antennas at the BS, the number of single-antenna users, and the number of data streams, respectively. In massive MIMO with full CSI, the highest possible SMG is constrained by the channel coherence block size due to the large values of K and N_{BS}. To solve this issue, low-dimensional CSI may be used to reduce the overhead. With FDD mode, the overhead increases because of the required uplink feedback for each antenna [25], [79]. One way to minimize the training overhead is to use low-resolution CSI. In this context, several papers divide the HBF design into two stages: in the first stage, only the ABF is performed based on the estimated CSI, while in the second stage, the DBF is performed based on the full CSI [48].

D. HYBRID BEAMFORMING WITH NARROWBAND AND WIDEBAND CHANNELS
With narrowband transmission, the channel does not change over the coherence bandwidth, and the signal bandwidth should be less than or equal to the channel coherence bandwidth. Basically, an optimal design of the DBF depends on the SVD of the channel matrix. Specifically, the right singular vectors and the water-filling approach are used to obtain the optimal precoder, while the left singular vectors are used to obtain the combiner. In practice, the assumption that the signal bandwidth is less than or equal to the channel coherence bandwidth is very restrictive and can limit the system data rate. Therefore, wideband beamforming systems are proposed to overcome this technical limitation. With wideband transmission, the channel is decomposed into sub-bands where the channel remains constant over each sub-band, and accordingly narrowband beamforming algorithms can then be applied over each sub-band based on the performance requirements. The combination of OFDM and MIMO technologies [80] represents one of the standard wideband beamforming approaches, which requires a fully digital scheme to control the phase and amplitude of the signal for each sub-carrier. However, in HBF, the ABF stage decreases the DoFs of the wideband system.
The phase shifter network produces the same phase shift for all sub-bands, making the wideband HBF problem more challenging [56], [81]- [83].
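For the narrowband fully digital baseline described above, the SVD-based precoder and combiner with water-filling can be sketched as follows. The function names, the noise variance parameter, and the example channel are assumptions of this sketch; in a wideband system the same routine would be applied per sub-band.

```python
import numpy as np

def water_filling(gains, total_power):
    """Allocate p_i = max(mu - 1/g_i, 0) such that sum(p_i) = total_power."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]          # strongest eigenmode first
    inv = 1.0 / gains[order]
    power = np.zeros_like(gains)
    for k in range(len(gains), 0, -1):       # try k active modes, largest k first
        mu = (total_power + inv[:k].sum()) / k
        if mu > inv[k - 1]:                  # weakest active mode gets positive power
            power[order[:k]] = mu - inv[:k]
            break
    return power

def svd_precoder_combiner(h, n_s, total_power, noise_var):
    """Right singular vectors with water-filling as precoder, left ones as combiner."""
    u, s, vh = np.linalg.svd(h)
    p = water_filling(s[:n_s] ** 2 / noise_var, total_power)
    precoder = vh.conj().T[:, :n_s] * np.sqrt(p)
    combiner = u[:, :n_s]
    return precoder, combiner, p

rng = np.random.default_rng(2)
h = (rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8))) / np.sqrt(2)
precoder, combiner, p = svd_precoder_combiner(h, n_s=2, total_power=1.0, noise_var=0.1)
effective = combiner.conj().T @ h @ precoder   # diagonal: parallel eigenchannels
```

The effective channel seen between the combiner output and the precoder input is diagonal, so the streams do not interfere and only the per-stream powers remain to be chosen.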

E. HYBRID BEAMFORMING BASED ON SWITCHES, PHASE SHIFTERS, AND LENS ANTENNA ARRAY
In HBF, the limited number of the RF chains is mapped to a massive number of antennas via an analog network of phase shifters and/or switches. The proposed hybrid architectures for mmWave massive MIMO systems are mostly based on phased arrays. However, in mmWave systems, the phase shifter network can be complex and power consuming [84]- [86]. Besides, in practice, the finite phase shifters used in phased arrays are not sufficiently precise for accurate beam steering. Furthermore, increasing the quantization bits to improve the accuracy incurs more complexity and power consumption. Additionally, employing passive instead of active phase shifters to reduce the power consumption is not feasible due to the higher insertion losses [84].
On the other hand, switches-based architectures can achieve lower power consumption with the same number of RF chains [76], [87]. However, switches-based architectures offer lower array gain than phase shifters, which in turn leads to a reduction in the spectral efficiency. If equal power consumption is considered, a switches-based architecture can use more RF chains to improve the spectral efficiency. In general, the switch network consumes less power but also offers lower spectral efficiency in comparison with phase shifter structures [43].
The HBF architecture can also be realized using continuous aperture phased MIMO (CAP-MIMO) at the transmitter and receiver. In this architecture, lens antenna arrays are adopted instead of the switches or phase shifters to realize the beamspace MIMO (B-MIMO) [88]. A large lens antenna array is triggered through the array feed, known as the beam selector, which controls the angles of the focused beams generated from the antenna lens. Similar to the spatially sparse precoding, with the help of a limited number of RF chains to select a sub-set of antennas, the CAP-MIMO is able to leverage high gain low-dimensional beam-space, which results from the utilization of the sparse nature of the multi-path mmWave channel.
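The beam selection step of beamspace MIMO can be sketched under an idealized assumption: the lens is modeled as a unitary DFT applied across the antenna dimension, and the beam selector keeps the beams with the largest total energy. The function names and the toy channel with on-grid directions are illustrative assumptions.

```python
import numpy as np

def beamspace_channel(h):
    """Transform antenna-domain columns to beamspace with a unitary DFT (ideal lens)."""
    return np.fft.fft(h, axis=0) / np.sqrt(h.shape[0])

def select_beams(h_beam, n_rf):
    """Indices of the n_rf beams that carry the most energy across all users."""
    energy = np.sum(np.abs(h_beam) ** 2, axis=1)
    return np.sort(np.argsort(energy)[-n_rf:])

n_antennas = 16
n = np.arange(n_antennas)
# Two single-path users whose directions fall exactly on DFT beams 3 and 11.
h = np.stack([np.exp(2j * np.pi * n * k / n_antennas) for k in (3, 11)], axis=1)
h_beam = beamspace_channel(h)
beams = select_beams(h_beam, n_rf=2)
h_reduced = h_beam[beams, :]      # low-dimensional channel seen by the RF chains
```

In this toy case all the channel energy falls on two beams, so two RF chains capture it completely; with off-grid or multipath channels the energy leaks into neighbouring beams and the selection becomes an approximation.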

F. HYBRID BEAMFORMING USING JOINT OR SEPARATE DESIGN FOR DIGITAL AND ANALOG BEAMFORMERS
HBF has two distinguishing characteristics inherited from ABF and DBF: 1) the low number of RF chains, which helps to reduce the digital processing complexity, and 2) the constant amplitude of the phase shifters, which helps to reduce the complexity of the analog processing and allows phase-only control. These two characteristics enable a considerable decrease in the overall complexity of massive MIMO mmWave systems. It is a challenging task to maximize the spectral efficiency and determine the optimum matrices of both the DBF and ABF stages under the aforementioned characteristics/constraints, which can be addressed by two different types of methods. The first type designs the analog and digital precoders and combiners using joint optimization techniques. The second type adopts a two-step approach: in the first step, the analog precoder and combiner are optimized based on a specific criterion, and then the digital precoder and combiner are determined in the second step to improve the performance [93], [94].
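A two-step (separate) design can be sketched with a simple heuristic: the analog precoder is obtained by keeping only the phases of the dominant right singular vectors, and the digital precoder is then fitted by least squares to the unconstrained precoder. This heuristic, the function name, and the dimensions are assumptions chosen for illustration; they are not the specific methods of [93], [94].

```python
import numpy as np

def two_step_hybrid_precoder(h, n_s, n_rf):
    """Step 1: phase-only analog precoder. Step 2: least-squares digital precoder."""
    n_t = h.shape[1]
    _, _, vh = np.linalg.svd(h)
    v = vh.conj().T
    f_opt = v[:, :n_s]                                  # unconstrained target
    # Step 1: keep phases only, so every entry has modulus 1/sqrt(n_t).
    f_rf = np.exp(1j * np.angle(v[:, :n_rf])) / np.sqrt(n_t)
    # Step 2: digital stage compensates for the lost amplitude control.
    f_bb, *_ = np.linalg.lstsq(f_rf, f_opt, rcond=None)
    f_bb = f_bb * np.sqrt(n_s) / np.linalg.norm(f_rf @ f_bb)
    return f_rf, f_bb, f_opt

rng = np.random.default_rng(3)
h = (rng.standard_normal((8, 32)) + 1j * rng.standard_normal((8, 32))) / np.sqrt(2)
f_rf, f_bb, f_opt = two_step_hybrid_precoder(h, n_s=2, n_rf=4)
```

A joint design would instead update the analog and digital stages together, for example by alternating between them until the approximation error stops decreasing.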

VI. HYBRID BEAMFORMING CHALLENGES
With the mmWave massive MIMO system, HBF designs attempt to solve several key challenges as follows:

A. CHANNEL ESTIMATION
The CSI is required to achieve optimal beamforming gains. Channel estimation and beamforming for HBF are more challenging than for fully digital systems. This is because the number of RF chains is significantly smaller than the number of antennas, and the constraints imposed by the phase shifters turn the design process into non-convex optimization problems that are very hard to solve. Therefore, the channel estimation process for HBF-based systems can be considered as an optimization problem, in which the optimal beams for the transmitter and receiver are known. In the design of channel estimation techniques for HBF-based systems, it is required to minimize the channel estimation overhead, which limits the spectral efficiency. Compressive sensing methods can be employed to exploit the mmWave channel sparsity in the angular domain to reduce the number of estimated channel parameters and the complexity of the process. This in turn results in a significant decrease in the dimensions of the channel matrix and in the complexity of the precoder and combiner designs. Additionally, in high-mobility scenarios, rapid channel variations lead to an estimation delay, which in turn can affect the performance.

B. PHASE SHIFTERS CONSTANT MODULUS AND FINITE RESOLUTION
Another challenge in designing hybrid beamformers arises when finite-resolution phase shifters with constant modulus are considered, which transform the HBF optimization into a difficult non-convex and combinatorial problem. Besides, since the hybrid beamformer performance depends on the joint optimization of the analog and digital beamformers, HBF design schemes are different from those used for beamforming in conventional MIMO systems, which consider either fully-ABF or fully-DBF approaches. With digital phase shifters, the optimization problem becomes a combinatorial problem with a large search space, especially in massive MIMO systems. The use of a predefined set of phases or codebooks can reduce the search space and complexity. Unfortunately, the codebooks fit well only with special channel structures, such as sparse channels.
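The effect of finite-resolution phase shifters can be illustrated by quantizing the phases of an ideal steering vector and measuring the remaining array gain. The half-wavelength ULA, the steering angle, and the function names are assumptions of this sketch.

```python
import numpy as np

def quantize_phases(weights, n_bits):
    """Snap each phase to the nearest of 2**n_bits uniform levels, keeping the modulus."""
    step = 2.0 * np.pi / 2 ** n_bits
    phases = step * np.round(np.angle(weights) / step)
    return np.abs(weights) * np.exp(1j * phases)

def steering_vector(n_antennas, angle):
    """Unit-norm response of a half-wavelength ULA (assumed geometry)."""
    n = np.arange(n_antennas)
    return np.exp(1j * np.pi * n * np.sin(angle)) / np.sqrt(n_antennas)

a = steering_vector(64, np.deg2rad(23.0))
gains = {}
for bits in (1, 2, 3, 4):
    a_q = quantize_phases(a, bits)
    # Normalized array gain in the intended direction; 1.0 means no loss.
    gains[bits] = float(np.abs(a.conj() @ a_q) ** 2)
    print(bits, "bits -> normalized gain", round(gains[bits], 3))
```

Since each phase error is at most half a quantization step, the normalized gain with b bits is lower-bounded by cos^2(pi / 2^b), which shows why a few bits already recover most of the array gain while the search space still grows as 2^b per phase shifter.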

C. OPTIMAL NUMBER OF RF CHAINS
A significant challenge with HBF architecture designs is to determine the optimal number of RF chains while considering multi-stream transmission with large number of antenna elements. The main design goal is to achieve DBF optimal performance considering the hardware complexity and power consumption limitations.

D. PRECODER AND COMBINER DESIGN
Joint design of digital and analog precoders usually aims to maximize the spectral efficiency under both angle quantization and constant amplitude/modulus constraints. It is useful to design algorithms that allow parallel hardware architecture to achieve effective signal processing and very-large-scale integration (VLSI) implementation. Also, it poses a challenge to put forward an optimal joint design of precoder and combiner, and develop algorithms that are able to minimize the inter-cell interference as well as the out-of-cell interference in MU-MIMO systems.

E. BEAM TRAINING AND FEEDBACK
The design of hybrid precoders and combiners is typically constrained by the constant-gain analog phase shifters and the low-resolution quantized phase control. Therefore, there is a need to develop optimal beamforming algorithms and codebook designs. Additionally, an optimal reference signal for HBF can also decrease the training and feedback overhead, and accordingly help to achieve a power-efficient and low-complexity system. Different solutions have been considered to address this issue. For instance, user localization is employed in the literature to keep track of the channel conditions, and different equalization techniques are exploited to adapt to dynamically varying channel conditions.

VII. RECENT STUDIES REGARDING HYBRID BEAMFORMING IN THE LITERATURE
In this section, we classify HBF techniques based on the design criteria. To the best of our knowledge, this is the first work to address the HBF taxonomy, which will help the designers of 5G and beyond networks to decide which HBF approach to deploy in their networks.

A. HBF WITH FULLY AND PARTIALLY-CONNECTED ARCHITECTURES
1) FULLY-CONNECTED ARCHITECTURE
Although the fully-connected HBF architecture is harder to implement than the partially-connected architecture, it is still in use due to its higher spectral efficiency. In [18], an optimization framework is presented to obtain the optimum number of RF chains as well as the number of antenna elements in each transceiver in an HBF-based mmWave system, which in turn demonstrates the advantages and disadvantages of the fully-connected HBF approach. Specifically, the authors determined the optimal configurations that maintain an efficient tradeoff between the spectral and energy efficiencies with different numbers of antennas and RF chains when the static energy consumption is considered. The results could serve as a reference to determine different configuration parameters based on the application requirements. Additionally, the authors in [43] introduced an antenna selection approach in the fully-connected HBF architecture to avoid the channel fading effects suffered by conventional antenna selection approaches. To achieve this, the proposed method requires a variable phase shifter to control the number of RF chains involved in the operation. Although this approach slightly increases the hardware complexity, it is able to achieve higher multiplexing and diversity gains. In [73], the authors attempted to achieve optimum energy efficiency at the BS by reducing the number of RF chains and the baseband energy consumption. An energy-efficient algorithm is proposed to evaluate the optimum number of RF chains, and an efficient tradeoff between the hardware cost and energy efficiency is achieved. The results illustrated that the joint optimization of the hardware cost and energy efficiency led to a 170% improvement. The work in [89] also optimized the energy efficiency by designing an HBF for the downlink of a mmWave MU massive MIMO system. The proposed architecture employs a two-stage approach. In the first stage, the ABF is employed to improve the link margin and to choose the optimum beam that maximizes the desired user power while minimizing the interference from all other users. In the second stage, the DBF utilizes a zero-gradient strategy that improves the spatial multiplexing gain and maximizes the energy efficiency of the desired user. The authors in [68], [90] studied the fully-connected HBF architecture in the uplink and downlink of a massive MIMO system. They considered finite-resolution phase shifters and a limited number of RF chains. Furthermore, they proposed a heuristic algorithm for optimizing the system sum-rate and demonstrated that the achievable rate can be increased when the number of RF chains is greater than or equal to the number of data streams. Moreover, in [91], a fully-connected HBF architecture is also considered for the downlink, and its effect on both the data rate and coverage performance in the cases of MU-MIMO, SU with spatial multiplexing, and SU with analog beamforming has been investigated.
The concept of codebook-enabled HBF has been proposed in [83] for mmWave systems with limited feedback between transmitters and receivers. The studies in [48], [60], [88], [92] have proposed the adoption of orthogonal matching pursuit (OMP) and gradient pursuit (GP) algorithms for fully-connected HBF architectures to achieve higher performance based on accurate channel estimates. As shown in Fig. 9, the OMP-based HBF approach provides performance comparable to that of the fully DBF in terms of spectral efficiency. However, since the design is treated as a sparsity-constrained matrix completion problem, the HBF design needs to maintain a good tradeoff between performance loss and simplicity. Moreover, extra overhead is also incurred to acquire the array response vector information needed to exploit the sparsity. In [68], a fully-connected HBF is considered for both the uplink and downlink of a mmWave massive MIMO system. The authors demonstrated that the HBF-based system can match the performance of the fully DBF-based architecture with fewer RF chains, provided that the number of RF chains is at least twice the number of data streams. In [95], a practical MU massive MIMO HBF system is investigated in which ZF precoding is used in the DBF stage and the beam selection process is executed within the ABF stage. The proposed system showed that the HBF with more RF chains can offer better performance than the fully-digital architecture. In [78], two low-resolution ADC models are introduced while employing fully-connected HBF architectures in both the uplink and downlink. The two models exploit different channel features, different antenna configurations, and different hardware limitations. A single comparator is used to implement a 1-bit ADC in the second model to reduce the power consumption in power-limited mobile devices (MDs), while the first model achieves better performance in backhaul connections. Finally, an iterative algorithm is proposed in [96] to implement the HBF architecture. In this study, the phase shifters can provide only a discrete set of phase changes, which enables a low-cost implementation of the ABF using practical finite-resolution phase shifters while improving the spectral efficiency.

2) PARTIALLY-CONNECTED ARCHITECTURE
In [56], [97], a partially-connected HBF architecture is designed, where the transmitter and receiver are equipped with many antenna sub-arrays, with independent phase shifters, that are used to design the beam steering. In this design, the transmitted signal is adaptively modified based on the mmWave channel characteristics. The partially-connected HBF architecture is also investigated in [98], where the authors introduced an efficient solution to the peak-to-average-power-ratio (PAPR) problem in massive MIMO-OFDM systems under total average transmission power constraints. To realize the optimum achievable rate under transmit power constraints, joint optimization of the digital and analog beamforming matrices is proposed in [99]. With this architecture, each element of the uniform linear array (ULA) is attached to a different RF chain. The proposed solution provides an enhancement over the beam-steering method with various antenna configurations. Another partially-connected HBF architecture is proposed in [100], where a low-complexity iterative algorithm is used first to design the DBF stage followed by the ABF stage. In [101], a phased-array-based ABF stage is used such that the HBF can efficiently combine or distribute signal energy in sparse mmWave channels, while the digital RF chains perform the multiplexing for more flexibility. The study reveals that HBF with a phase shifter network achieves better performance with narrowband signals, while the more complex tapped-delay beam steering approach is more suitable for wideband signals. The authors in [102] present a partially-connected HBF that is based on maximizing the spectral efficiency for downlink multi-user MIMO (MU-MIMO) mmWave systems when full CSI is available at both the users and the BS. Finally, the authors in [75] proposed a dynamic partially-connected HBF architecture that uses the channel statistics as a criterion for sub-array selection. To achieve near-optimal sub-array selection, the exhaustive search approach is replaced by a low-complexity greedy algorithm that achieves nearly the same spectral efficiency level.
In [103], an iterative hybrid precoding algorithm based on the concept of successive interference cancellation (SIC) is proposed for the partially-connected architecture. The authors assumed a diagonal digital precoding matrix that allocates the power to the different data streams. Moreover, they assumed an equal number of RF chains and data streams, which makes the beamforming gain of the analog precoder dominate the gain achieved with the whole HBF architecture. As shown in Fig. 10, the SIC algorithm outperforms the ABF. However, it only provides a sub-optimal solution compared with the fully DBF and HBF. In [104], the authors presented an HBF approach that applies a semi-definite relaxation algorithm and developed an alternating minimization approach, namely SDR-AltMin, to evaluate the optimum values for both the digital and analog precoders. As shown in Fig. 10, the SDR-AltMin algorithm offers a remarkable enhancement compared to ABF. The SDR-AltMin algorithm also outperforms the SIC-based algorithm. This is because the digital precoder in the semi-definite relaxation is able to fully exploit the capabilities of DBF in steering the transmitted signals, in contrast to the SIC-based algorithm, which uses the digital precoder only to allocate the power to the data streams.

3) FULLY-CONNECTED AND PARTIALLY-CONNECTED COMPARISON
The fully-connected HBF architecture provides better performance compared to the partially-connected counterpart in terms of the spectral efficiency for any combination of transceiver antennas in the ABF and DBF approaches. However, the energy efficiency of the fully-connected HBF VOLUME 8, 2020 architecture is lower than that provided with the partiallyconnected counterpart. The spectral efficiency deterioration caused by applying the partially-connected architecture can be balanced by increasing the number of antenna elements. Accordingly, adjusting the connectivity should be done to maintain an optimum tradeoff between the performance loss and the number of phase shifters (i.e. cost and complexity) in the HBF architecture. The works in [18], [25], [75] provided a comprehensive comparison between fully-connected and partially-connected architectures. Moreover, in [104], an efficient alternating minimization based algorithms are proposed to design the HBF with either full and partiallyconnected architectures while formulating the HBF design as a matrix factorization problem. The study reveals that the complex fully-connected architecture can achieve the same performance as that given by the fully DBF if the number of RF chains is larger than the number of data streams. Additionally, the partially-connected HBF architecture achieves better spectral efficiency compared with the fully connected counterpart. Moreover, the study also investigated the performance of the HBF design with both narrowband and wideband channels and also with partial CSI. In the partiallyconnected case, an alternating minimization algorithm based on semi-definite relaxation (SDR-AltMin) is used to design the analog and digital precoders stages. In the fully-connected architecture, the authors proposed to implement the analog precoder stage by taking into consideration the unity modulus constraint and carrying out the optimization over Riemannian manifold to maintain an accurate solution. 
Furthermore, the authors have proposed an alternating minimization algorithm based on manifold optimization, namely MO-AltMin, for designing the analog precoding stage. Regarding the digital precoding, the authors also developed an alternating minimization algorithm based on phase extraction. It turns out that the partially-connected HBF architecture achieves higher energy efficiency while the fully-connected architecture provides higher spectral efficiency. On the other hand, the authors in [105] have introduced heuristic algorithms to implement both the digital and analog precoding stages for the HBF. The proposed approach has achieved a near-optimum spectral efficiency for the fully-connected architecture with a significantly smaller number of RF chains. Another important contribution that is worth mentioning here is the work presented in [106]. In this work, an overlapped subarray (OSA) design is proposed for fully-connected and partially-connected HBF architectures. Specifically, a unified low-rank sparse recovery algorithm is used to design the HBF for the mmWave MU massive MIMO downlink system. It turns out that this strategy maintains an optimal trade-off between the achievable performance and system complexity.
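The alternating structure of such designs can be illustrated with a minimal NumPy sketch in the spirit of a phase-extraction update. This is not claimed to be the exact procedure of [104]: the function name `pe_altmin`, the random initialization, the fixed iteration count, and the surrogate objective ||F_opt F_dd^H - F_rf||_F (with a semi-unitary digital precoder F_dd) are assumptions made here for illustration only.

```python
import numpy as np

def pe_altmin(F_opt, n_rf, n_iter=30, seed=0):
    """Illustrative phase-extraction style alternating minimization.

    F_opt : (n_t, n_s) target fully digital precoder.
    n_rf  : number of RF chains (assumed n_rf >= n_s).
    Returns the unit-modulus analog precoder, the power-normalized
    digital precoder, and the surrogate objective per iteration.
    """
    rng = np.random.default_rng(seed)
    n_t, n_s = F_opt.shape
    # Random unit-modulus starting point for the analog precoder.
    F_rf = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(n_t, n_rf)))
    history = []
    for _ in range(n_iter):
        # Digital step: semi-unitary F_dd minimizing ||F_opt F_dd^H - F_rf||_F
        # (an orthogonal Procrustes problem solved by one thin SVD).
        U, _, Vh = np.linalg.svd(F_rf.conj().T @ F_opt, full_matrices=False)
        F_dd = U @ Vh
        history.append(np.linalg.norm(F_opt @ F_dd.conj().T - F_rf))
        # Analog step: keep only the phases (unit-modulus constraint).
        F_rf = np.exp(1j * np.angle(F_opt @ F_dd.conj().T))
    # Final digital precoder, scaled so that ||F_rf F_bb||_F^2 = n_s.
    U, _, Vh = np.linalg.svd(F_rf.conj().T @ F_opt, full_matrices=False)
    F_dd = U @ Vh
    F_bb = np.sqrt(n_s) / np.linalg.norm(F_rf @ F_dd) * F_dd
    return F_rf, F_bb, history

# Hypothetical example: 64 transmit antennas, 16 receive antennas, 2 streams.
rng = np.random.default_rng(1)
H = (rng.standard_normal((16, 64)) + 1j * rng.standard_normal((16, 64))) / np.sqrt(2)
n_s = 2
F_opt = np.linalg.svd(H)[2].conj().T[:, :n_s]  # dominant right singular vectors
F_rf, F_bb, history = pe_altmin(F_opt, n_rf=n_s)
```

Each of the two steps minimizes the surrogate exactly with the other factor fixed, so the recorded objective cannot increase from one iteration to the next; the quality of the final hybrid precoder relative to F_opt still depends on the channel and on the number of RF chains.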
In Fig. 11, the spectral efficiency is plotted versus SNR for different HBF approaches with the fully ABF and DBF approaches as the baselines. We have assumed an equal number of RF chains and data streams (i.e., $N_{RF}^{T} = N_{RF}^{R} = N_{S}$) as the worst-case scenario. This is because, in practical systems, the number of RF chains cannot be smaller than the number of data streams. In this scenario, for the fully-connected architecture, the OMP algorithm achieves significantly lower spectral efficiency than the optimal fully DBF, because the OMP algorithm performs poorly when $N_{RF}$ equals $N_{S}$. In contrast, the manifold optimization algorithm based on alternating minimization (i.e., MO-AltMin) [104] delivers near-optimal performance over the entire SNR range even though the recommended number of RF chains (i.e., $N_{RF} \geq 2N_{S}$) is not employed. The partially-connected architecture achieves considerable performance gain over the fully ABF with the alternating minimization based SDR algorithm (SDR-AltMin) [104]. However, the achievable spectral efficiency is much higher at higher SNR levels. The SIC-based approach in [103] is also included in the comparison as a benchmark for the partially-connected architecture. As shown in Fig. 11, the SDR-AltMin algorithm outperforms the SIC-based algorithm since the SDR-AltMin algorithm fully exploits the digital precoder, in contrast with the SIC-based technique, which only utilizes the digital part to distribute the power among the available data streams. In Fig. 12, the spectral efficiency versus SNR is provided for different beamforming approaches, assuming the optimal values of $N_{S}$ and $N_{RF}$ (i.e., $N_{RF} \geq 2N_{S}$). It is noticed that the performance of the SDR, OMP, PE-AltMin, and MO-AltMin approaches improves due to the increase in $N_{RF}$. The performance gap between OMP and the alternating minimization based algorithms shrinks, and the three iterative algorithms achieve near-optimal performance over the entire SNR range.
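The cost side of the fully-connected versus partially-connected tradeoff discussed in this subsection can be made concrete with a small bookkeeping sketch. The phase-shifter counts follow the usual definitions (every RF chain wired to every antenna versus every antenna wired to a single RF chain). The additive power model and all per-component power values are hypothetical placeholders chosen only for illustration, not measurements from any cited work.

```python
def phase_shifter_count(n_antennas, n_rf, architecture):
    """Number of phase shifters at one end of the link."""
    if architecture == "fully":
        # Every RF chain drives every antenna through its own phase shifter.
        return n_antennas * n_rf
    if architecture == "partially":
        # Every antenna is driven by exactly one RF chain.
        return n_antennas
    raise ValueError("architecture must be 'fully' or 'partially'")

def total_power_w(n_antennas, n_rf, architecture,
                  p_common=10.0, p_rf=0.1, p_ps=0.01, p_pa=0.1):
    """Simple additive power model in watts (all defaults are hypothetical)."""
    n_ps = phase_shifter_count(n_antennas, n_rf, architecture)
    return p_common + n_rf * p_rf + n_ps * p_ps + n_antennas * p_pa

def energy_efficiency(spectral_efficiency, power_w):
    """Energy efficiency in bits/s/Hz per watt."""
    return spectral_efficiency / power_w

# Hypothetical example: 64 antennas and 4 RF chains.
p_full = total_power_w(64, 4, "fully")
p_part = total_power_w(64, 4, "partially")
```

Under this model, the partially-connected architecture wins on energy efficiency whenever its spectral efficiency loss is proportionally smaller than its power saving, which is one way to read the tradeoff described above.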

B. HYBRID BEAMFORMING BASED ON FULL CSI AND ESTIMATED CSI
In the literature, it is observed that the presented strategies for implementing HBF are generally based on either estimating the CSI or assuming the availability of full CSI at all the network nodes. In the following two sub-sections, we explain the concepts behind both the full and estimated CSI and their impact on the design of HBF.

1) FULL CHANNEL STATE INFORMATION
The two main approaches to achieve practical near-optimal solutions and overcome the challenges discussed in Section VI are as follows: 1) Optimal beamformer approximation: In the fully digital structure, the optimal beamformer for SU-MIMO can be obtained from the SVD of the channel matrix. Another method based on eigenvalue decomposition is proposed in [43]. A different approach is addressed in [48], which obtains near-optimal hybrid beamformers by minimizing the Euclidean distance to the fully digital solution. In sparse mmWave channels, minimizing the Euclidean distance produces a quasi-optimal solution. In traditional non-sparse channels, the analog and digital beamformers can be obtained using alternating optimization based algorithms. 2) Decoupling the design of the analog and digital beamformers: The coupling between the analog and digital beamformers, as well as the coupling between the transceivers, are major challenges in HBF. To reduce the complexity, the beamformers need to be decoupled, which can be done sequentially. For example, we can cancel the effect of the combiner on the precoder by assuming a fully digital MMSE-based receiver to increase the sum rate for SU-MIMO, and then assume a unitary fully digital precoder to decouple the analog and digital precoders. Finally, the precoder matrix is optimized column-by-column by enforcing the phase-only constraint on each antenna, and a closed-form expression of the digital precoder can then be obtained. Several different decoupling approaches have also been investigated in [25].
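As a minimal sketch of the first approach, the following NumPy code computes the SVD-based fully digital precoder, the spectral efficiency under equal power allocation across streams, and the Euclidean (Frobenius) distance that a hybrid design would try to make small. The function names, the i.i.d. Gaussian test channel, and the equal-power assumption are illustrative choices made here and are not taken from the cited works.

```python
import numpy as np

def svd_precoder(H, n_streams):
    """Return the dominant right singular vectors of H and its singular values."""
    _, s, Vh = np.linalg.svd(H)
    return Vh.conj().T[:, :n_streams], s

def spectral_efficiency(H, F, snr_linear):
    """log2 det(I + (SNR / N_s) H F F^H H^H), equal power per stream."""
    n_streams = F.shape[1]
    H_eff = H @ F
    M = np.eye(H.shape[0]) + (snr_linear / n_streams) * (H_eff @ H_eff.conj().T)
    _, logdet = np.linalg.slogdet(M)
    return logdet / np.log(2.0)

def hybrid_distance(F_opt, F_rf, F_bb):
    """Euclidean distance between the digital target and a hybrid precoder."""
    return np.linalg.norm(F_opt - F_rf @ F_bb)

# Hypothetical example: 8 x 32 i.i.d. Gaussian channel, 2 streams, 10 dB SNR.
rng = np.random.default_rng(0)
n_r, n_t, n_s, snr = 8, 32, 2, 10.0
H = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
F_opt, s = svd_precoder(H, n_s)
se_opt = spectral_efficiency(H, F_opt, snr)

# A random precoder with orthonormal columns, for comparison.
Q, _ = np.linalg.qr(rng.standard_normal((n_t, n_s)) + 1j * rng.standard_normal((n_t, n_s)))
se_rand = spectral_efficiency(H, Q, snr)
```

With the SVD precoder, the spectral efficiency reduces to a sum of per-stream terms log2(1 + (SNR/N_s) sigma_i^2), and no other precoder with orthonormal columns can exceed it under equal power allocation.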

2) ESTIMATED CHANNEL STATE INFORMATION
Using estimated CSI for the analog part in HBF was initially proposed in [107], where a near-optimal beamformer for SU-MIMO systems is designed. The authors in [79] also developed a joint spatial division multiplexing (JSDM) design for MU-MIMO systems with HBF employed at the BS, which serves single-antenna MDs. In this scheme, the MDs sharing the same transmit channel covariance are grouped together in a cluster to reduce the training and feedback overheads. However, the proposed JSDM approach cannot deal efficiently with the interference between MDs within the same cluster. In [108], the JSDM design was extended to overcome the interference problem, and the average sum-rate was increased by proposing a modified MMSE algorithm. The majority of the channel estimation methods in HBF systems exploit the sparsity of the mmWave channels in the angular domain [34], [48], [52], [61], [64], [109]-[113]. This is because the sparsity in the angular domain allows the channels to be modeled by the AoA, AoD, and gain of each path. The sparsity feature also facilitates the use of compressive sensing methods in the channel estimation to estimate the elements of the channel with a small number of measurements, resulting in a reduced channel matrix dimension [110]. In general, the channel estimation methods can be classified into open-loop or closed-loop, depending on whether a feedback connection is used or not [111]. In the closed-loop method, the receiver determines the best elements from a predefined codebook and feeds them back to the transmitter [34], [52]. In this case, the selection of the codebook can significantly affect system performance. Additionally, a large codebook size allows the formation of precise and sharp beams while increasing the feedback overhead. Therefore, the impact of the RF codebooks has been investigated by many studies in the literature [34], [48], [52], [64], [109].
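The way angular sparsity enables estimation from few measurements can be sketched with a generic orthogonal matching pursuit (OMP) routine. This is a textbook OMP, not the specific estimator of any cited work. The sensing matrix `A` is assumed here to stand for the combined effect of the training beams and an angular-grid dictionary, and it is drawn at random purely for illustration; the grid size, number of measurements, and number of paths are hypothetical.

```python
import numpy as np

def omp(A, y, sparsity):
    """Greedy sparse recovery of x from y = A x with at most `sparsity` nonzeros."""
    residual = y.copy()
    support = []
    coef = np.zeros(0, dtype=complex)
    for _ in range(sparsity):
        corr = np.abs(A.conj().T @ residual)
        if support:
            corr[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Least-squares refit on the current support, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1], dtype=complex)
    x_hat[support] = coef
    return x_hat, sorted(support)

# Hypothetical setup: 128 grid angles, 64 noiseless measurements, 3 paths.
rng = np.random.default_rng(0)
n_grid, n_meas, n_paths = 128, 64, 3
A = rng.standard_normal((n_meas, n_grid)) + 1j * rng.standard_normal((n_meas, n_grid))
A /= np.linalg.norm(A, axis=0)  # unit-norm columns
true_support = sorted(rng.choice(n_grid, size=n_paths, replace=False).tolist())
x_true = np.zeros(n_grid, dtype=complex)
x_true[true_support] = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=n_paths))
y = A @ x_true
x_hat, support = omp(A, y, n_paths)
```

In this noiseless toy setting, half as many measurements as grid points suffice to identify the three active angles; with noise or off-grid angles the recovery degrades, which is part of what motivates the multi-resolution and multi-grid refinements surveyed in this subsection.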
The studies in [48], [64] show that a simple codebook can be designed by uniformly sampling the beam steering space. However, the downside of this scheme is that a resolution of 6-7 bits for each phase shifter is required for acceptable performance. Since high-resolution phase shifters suffer from high insertion losses, codebooks with low-resolution phase shifters are favored in practice. In [112], an efficient hybrid codebook is proposed by employing OMP, which requires analog phase shifters and whose achieved performance depends on the number of RF chains. In [61], a closed-loop multi-resolution channel estimation is proposed where wide beams with low directivity gains are used at both the transmitter and receiver at the beginning of the channel estimation. Subsequently, a multi-step power allocation is employed where a high transmit power is used for wide beams and is slightly reduced as the beam gets sharper. The main drawbacks of the multi-resolution channel estimator are that the number of training steps scales up with the number of multi-path components and that a high transmit power is required at the start of the operation. In [113], the authors have extended the channel estimation method in [61] to a one-sided search, which reduces the initial transmit power. Also, the proposed method is based on ping-pong iterations, which eliminates the need for a separate feedback connection. In [111], another open-loop channel estimation method for HBF is proposed, which is based on the multi-grid OMP. Additionally, the authors have proven that the proposed algorithm performs better than the least-squares method with much lower complexity. A different approach has been proposed in [107] to minimize the channel estimation delay and signaling overhead in SU-MIMO HBF by changing the analog beamformer according to the varying parameters of the channel. 
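The interplay between codebook design and phase-shifter resolution can be illustrated with a short NumPy sketch. It assumes a half-wavelength uniform linear array, beams sampled uniformly in the sine of the steering angle, and uniform phase quantization; the function names and these modeling choices are assumptions made here, not the exact codebooks of [48], [64].

```python
import numpy as np

def steering_vector(n, angle_rad):
    """Unit-norm response of a half-wavelength uniform linear array."""
    return np.exp(1j * np.pi * np.arange(n) * np.sin(angle_rad)) / np.sqrt(n)

def quantize_phases(v, bits):
    """Round each element's phase to the nearest of 2**bits uniform levels."""
    step = 2.0 * np.pi / (2 ** bits)
    q = np.round(np.angle(v) / step) * step
    return np.exp(1j * q) / np.sqrt(v.size)

def build_codebook(n, n_beams, bits):
    """Codebook of quantized steering vectors, one column per beam."""
    angles = np.arcsin(np.linspace(-1.0, 1.0, n_beams, endpoint=False))
    beams = [quantize_phases(steering_vector(n, a), bits) for a in angles]
    return np.stack(beams, axis=1), angles

def select_beam(codebook, h):
    """Closed-loop style selection: index and gain of the best codeword for h."""
    gains = np.abs(codebook.conj().T @ h) ** 2
    idx = int(np.argmax(gains))
    return idx, float(gains[idx])

# Hypothetical example: 32 antennas, 64 beams, single-path channel on the grid.
n, n_beams = 32, 64
_, angles = build_codebook(n, n_beams, bits=6)
h = np.sqrt(n) * steering_vector(n, angles[5])
gain_6bit = select_beam(build_codebook(n, n_beams, 6)[0], h)[1]
gain_2bit = select_beam(build_codebook(n, n_beams, 2)[0], h)[1]
```

For a channel aligned with a codebook direction, the per-element phase error is at most pi / 2**bits, so the selected beam retains at least a fraction cos^2(pi / 2**bits) of the ideal array gain `n`: about 99.8% with 6 bits but only a guaranteed 50% with 2 bits.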
Moreover, in [79], [114], a single-antenna MU with a BS that employs HBF is considered, where the authors propose a joint spatial division multiplexing method that minimizes the downlink training and uplink feedback overhead. Specifically, the user channels are arranged into groups based on their covariance matrices, and the analog beamformer is devised such that the inter-group interference is minimized, and eventually, the digital precoder separates the user signals in each group.

C. HYBRID BEAMFORMING WITH NARROWBAND AND WIDEBAND CHANNELS
1) NARROWBAND CHANNELS
The HBF with a fully-connected architecture for narrowband SU systems was initially investigated in [43]. The study first proved that, when the number of RF chains is twice the number of data streams, fully digital beamforming performance can be reached by the HBF design. Then, the optimal analog beamformer is designed based on the phase shifts corresponding to the elements of the right and left singular vectors associated with the strongest singular value of the channel matrix. The study showed that when many symbols are transmitted over both uncorrelated and correlated channels, the same performance can be achieved with the HBF and fully DBF approaches. The study in [77] has demonstrated that a beam steering method towards the AoA and AoD can achieve the same performance as the fully DBF when the number of transceiver antennas is very large. This results from the convergence between the singular vectors of the channel matrix and the steering vectors towards the AoAs and AoDs. In [48], [54], the joint optimization design of the digital and analog beamformers is proposed based on the matching pursuit method exploiting the mmWave channel sparsity. Initially, the channel singular vectors are calculated, and then the hybrid beamformer is determined by minimizing the Euclidean distance between the singular vectors and the hybrid beamformer weights. Since the calculation of the singular vectors is computationally intensive, it can cause critical delays in practice. Furthermore, in [48], it was proven that the spectral efficiency depends significantly on the number of RF chains in the system and the multi-path components of the channel. The studies in [68], [90], [115] considered iterative HBF algorithms and achieved near-optimal performance in both sparse and rich scattering channels. Also, the tradeoff between beamforming and multiplexing gains is investigated in [116] using an iterative precoding design. 
The proposed algorithm has achieved the channel capacity at low SNR and full multiplexing gain at high SNR. Moreover, the design has improved the spectral efficiency by deciding whether to use an additional sub-array to increase the SNR or to transmit a new stream. By exploiting the sparsity of the mmWave channel, the authors in [69] have proposed a codebook-based partially-connected HBF design. However, codebook-based approaches are only applicable to specific channels and a fixed array geometry, and a new codebook must be designed whenever either the channel or the geometry changes.
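The idea of building the analog beamformer from the phases of the dominant singular vectors, and its link to beam steering, can be checked numerically. The sketch below is an illustration under assumptions made here (half-wavelength uniform linear arrays, a single analog beam at each end, hypothetical function names); it is not the full design procedure of [43] or [77].

```python
import numpy as np

def steering(n, theta):
    """Unit-norm response of a half-wavelength uniform linear array."""
    return np.exp(1j * np.pi * np.arange(n) * np.sin(theta)) / np.sqrt(n)

def phase_only(v):
    """Keep the element phases of v and force a constant modulus (unit norm)."""
    return np.exp(1j * np.angle(v)) / np.sqrt(v.size)

def analog_beam_pair(H):
    """Phase-only combiner/precoder from the dominant singular vectors of H."""
    U, s, Vh = np.linalg.svd(H)
    w = phase_only(U[:, 0])
    f = phase_only(Vh[0].conj())
    gain = abs(w.conj() @ H @ f)
    return w, f, gain, s[0]

# Single-path channel: the singular vectors are the steering vectors themselves.
H_single = 2.0 * np.outer(steering(16, 0.3), steering(64, -0.5).conj())
_, _, gain_single, s1_single = analog_beam_pair(H_single)

# Rich-scattering channel: phase-only beams lose part of the dominant-mode gain.
rng = np.random.default_rng(0)
H_multi = (rng.standard_normal((16, 64)) + 1j * rng.standard_normal((16, 64))) / np.sqrt(2)
_, _, gain_multi, s1_multi = analog_beam_pair(H_multi)
```

In the single-path case the phase-only beams attain the largest singular value exactly, because steering vectors already have constant modulus. In the rich-scattering case the effective gain can never exceed the largest singular value, and the gap is what the digital stage and additional RF chains are meant to recover.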

2) WIDEBAND CHANNELS
For wideband mmWave systems, a codebook-based HBF for the SU scenario is proposed in [83]. The proposed method applies Gram-Schmidt orthogonalization for designing the HBF matrices. Additionally, the authors of [56] have applied an exhaustive search over a specific codebook for designing the beamforming vectors. Furthermore, a channel estimation algorithm for single-stream transmission in a mmWave MIMO-OFDM system with the partially-connected HBF is also developed. In [81], the space-division multiple-access and orthogonal frequency-division multiple-access (SDMA-OFDMA) schemes are leveraged for downlink transmission in MU massive MIMO to maximize the sum-rate. Nevertheless, more efforts are needed to develop low-complexity algorithms for HBF in wideband channels. A combination of frequency scheduling and HBF is expected to be essential in future research [25], [28].

D. HYBRID BEAMFORMING BASED ON SWITCHES, PHASE SHIFTERS, AND LENS ARRAY ARCHITECTURES
Many studies have investigated the energy and spectral efficiency of HBF implemented using either phase shifters or switches [74], [76], [87], [117]. Most of these studies tried to determine several key design issues to achieve optimal energy and spectral efficiency, including the most energy-efficient structure, the optimum number of antennas to realize maximum energy efficiency, and the performance of either phase shifters or switches in terms of energy efficiency. The energy and spectral efficiency of phase-shifter-based HBF has been intensively studied in [43], [48], [74], [76], [87], [117], [118]. In [118], the insertion losses of the phase-shifter-based HBF schemes have been assessed under the specified constraints of both the energy and spectral efficiency.
The works in [63], [119] proposed an HBF architecture where a combination of switches and phase shifters is employed. A phase shifter selection method has been proposed in [63] to switch off half of the phase shifters without degrading the spectral efficiency, which consequently reduces the energy consumption significantly. Based on the same concept, the authors in [119] have developed a new design by employing a switching network to reduce the number of active phase shifters while preserving the same spectral efficiency. It has been proven that the proposed design is also applicable in sparse scattering and correlated Rayleigh fading channels. However, both studies only investigated the HBF spectral efficiency without giving much attention to the energy efficiency. In [120], the energy efficiency with distinct structures of switches and phase shifters is investigated for HBF-based massive MIMO systems. In [63], [119], the authors have derived a closed-form expression for the energy efficiency as a function of the spectral efficiency, enabling a concurrent examination of spectral and energy performance for each structure. This in turn leads to an efficient design of HBF that strikes a smart trade-off between the spectral and energy efficiencies. A beamspace HBF algorithm and a digital precoder acquisition algorithm, both based on compressive sensing, are proposed for SU mmWave MIMO systems in [121]. The beamspace algorithm has shown a 99.4% complexity reduction compared to the digital precoder acquisition algorithm. However, this study does not address the MU scenario. The study in [88] integrates the hybrid transceiver with beamspace MIMO to create continuous aperture phased MIMO (CAP-MIMO), which is able to achieve near-optimal performance. Also, in [122], beam selection and beamspace MIMO concepts are integrated to achieve near-optimal performance. 
Additionally, the beam selection concept in [122] has been formulated based on different criteria such as system capacity, path loss, receiver SINR, and minimum error rate. Moreover, to investigate the advantages of beamspace MIMO in terms of capacity improvement, the number of RF chains and the spectral and energy efficiencies are also considered in [123]-[125].
In HBF, the digital stage includes the power-hungry DAC and ADC at the transmitter and receiver, respectively. The analog stage consists of a network of phase shifters or switches. In practice, the phase-shifter network is implemented using finite-resolution DACs, ADCs, and phase shifters. Additionally, the higher mmWave frequencies inflict high power consumption due to the required high bit resolution and sampling rate [123]. In [124], the HBF complexity with respect to the ADC resolution is investigated. The study reveals that the low-resolution ADC (e.g., 1-bit ADC) has gained considerable attention due to its energy efficiency. However, it also incurs a rate loss and requires long training sequences for channel estimation. Nevertheless, this study did not address other hardware components such as switches or phase shifters within different HBF architectures. In [78], fully digital beamforming and HBF with 1-bit ADC resolution have been addressed. The authors have shown that utilizing low-resolution ADCs can achieve comparable performance at low and medium SNR values. Because of their low power consumption, low-resolution ADCs can also be employed at the downlink in the power-limited MDs while using the HBF at the uplink. In [125], the authors have shown that employing 5-6 bit ADCs can achieve almost the same performance level as infinite-resolution ADCs, assuming full-resolution phase shifters in the uplink of MU massive MIMO systems. The study in [126] also shows that employing a 4-5 bit finite-resolution HBF achieves better performance in terms of spectral and energy efficiency in comparison with the 1-bit ADC fully digital beamforming. In [127], the authors have proved that the 1-2 bit resolution fully DBF outperforms the HBF at low SNR, while the 3-5 bit resolution HBF achieves the optimal spectral efficiency and power consumption for a wide range of SNR. 
Finally, the authors in [128] have studied the trade-off between spectral and energy efficiencies for analog, digital, and hybrid precoding/combining designs. It turns out that, at low SNR, fully analog precoding/combining delivers the best spectral and energy efficiencies. When the low-resolution ADC fully DBF is compared to HBF, the fully DBF achieves better spectral efficiency.
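The resolution versus power tradeoff behind these low-resolution ADC studies can be illustrated with a toy quantizer model. The sketch assumes a real-valued (per-rail) Gaussian input, a uniform mid-rise quantizer clipped at three standard deviations, and the commonly used scaling in which ADC power grows in proportion to 2^bits times the sampling rate. The figure-of-merit value and the sampling rate are hypothetical, and none of this reproduces the system models of the cited works.

```python
import numpy as np

def uniform_quantize(x, bits, clip=3.0):
    """Mid-rise uniform quantizer with 2**bits levels on [-clip, clip]."""
    levels = 2 ** bits
    step = 2.0 * clip / levels
    idx = np.clip(np.floor(x / step), -(levels // 2), levels // 2 - 1)
    return (idx + 0.5) * step

def sqnr_db(x, xq):
    """Signal-to-quantization-noise ratio in dB."""
    return 10.0 * np.log10(np.mean(x ** 2) / np.mean((x - xq) ** 2))

def adc_power_w(bits, fs_hz, fom_j_per_step=1e-13):
    """Power model P = FoM * fs * 2**bits (the FoM default is hypothetical)."""
    return fom_j_per_step * fs_hz * 2 ** bits

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)  # unit-variance input on one ADC rail
bit_range = range(1, 7)
sqnr = [sqnr_db(x, uniform_quantize(x, b)) for b in bit_range]
power = [adc_power_w(b, fs_hz=1e9) for b in bit_range]
```

Each additional bit doubles the modeled ADC power while the quantization noise keeps shrinking, so the resolution that maximizes energy efficiency depends on how much of that SQNR gain the operating SNR can actually exploit.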

VIII. OPEN RESEARCH ISSUES
Although different solutions to the HBF design in massive MIMO mmWave systems have been explored, the following issues have not been sufficiently addressed.
• Dual-band small cells: Small cells with dual access can work in licensed and unlicensed bands and are anticipated to be a fundamental part of 5G networks. So far, it is not clear how traffic should be directed from the licensed band to the unlicensed band and how such integration influences the HBF design [129].
• Antenna selection optimization: The absence of full CSI is a challenge in the HBF context. More investigations are needed for the RF chains and throughput optimization in the presence of CSI errors and mmWave wideband channels [35], [61]. Besides, more efforts are also needed in the optimal selection of antenna arrays [130].
• Handover and mobility management: Mobility control and handover are critical issues in HBF mmWave systems due to the frequent handovers in the mmWave band imposed by its blockage-prone nature [131].
• Interference management: Conventional interference management methods for omni-directional transmission are not suitable for directional beamforming transmission. HBF also affects the interference levels because of its different architecture. The methods for handling these issues have not been sufficiently investigated [132].
• Beamforming for green communications: The research on 5G and beyond networks is mainly focused on satisfying the huge traffic demand and providing seamless connectivity. It is also important to consider the design of energy-efficient systems with minimal footprint [133], [134]. The joint design of energy and spectral efficiencies attracts the most attention [135].
To further leverage the energy efficiency offered by the hybrid beamforming architecture, many studies attempted to determine the optimal tradeoff between the numbers of RF chains and antenna elements [136]. The goal is to optimize system parameters such that the energy efficiency for a specified spectral efficiency can be maximized. The joint design of the digital and analog domains for energy efficiency and low complexity is an interesting point to consider in future research.
• Hybrid beamforming with switches or phase shifters: In this paper, we presented different combinations of phase shifters and switches, and addressed the corresponding trade-off between energy and spectral efficiencies. With ideal assumptions, such as neglecting the phase shifters and switches insertion losses, this architecture can offer improved performance. However, further efforts are needed to investigate the feasibility of such architectures in practice and the tradeoff between energy and spectral efficiencies when the insertion losses are considered.
• Analysis of the channel estimation in hybrid beamforming: In practice, the channel varies over time and channel estimation should be repeated after the channel coherence time has passed. Investigating the tradeoffs between the acceptable estimation errors, the signaling overhead, and data transmission within the channel coherence time is essential to realize the HBF.
• Deep learning based hybrid beamforming algorithms: Recently, many studies exploited machine learning algorithms to solve many traditional communication problems, such as implementing autoencoders at both the transmitter and receiver, signal representation prediction, modulation classification, and channel estimation [137], [138]. Additionally, there are also some studies that addressed the HBF problem for simple scenarios [139]-[141]. More efforts should be exerted to solve the joint optimization problems of channel estimation, hybrid precoding, and scheduling. Also, employing deep learning based HBF algorithms in complex scenarios such as UAV-assisted communication, cell-free massive MIMO, and NOMA-based backscattering communication should be given much more attention in the future.
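The tradeoff raised in the channel estimation item above, between estimation error, signaling overhead, and data transmission within one coherence interval, can be illustrated with a deliberately simple scalar model. The model is an assumption made here for illustration: a unit-variance channel estimated by MMSE from `tau` pilot symbols, with the residual estimation error treated as additional noise. It does not capture HBF-specific training such as beam sweeping.

```python
import numpy as np

def effective_rate(tau, coherence_len, snr_linear):
    """Toy net spectral efficiency when tau of coherence_len symbols are pilots."""
    # MMSE estimation error variance after tau pilots (scalar, unit-variance channel).
    err = 1.0 / (1.0 + snr_linear * tau)
    # Residual estimation error acts as extra noise on the data symbols.
    snr_eff = snr_linear * (1.0 - err) / (1.0 + snr_linear * err)
    # Only the remaining fraction of the coherence interval carries data.
    return (1.0 - tau / coherence_len) * np.log2(1.0 + snr_eff)

coherence_len, snr = 100, 1.0  # hypothetical: 100 symbols, 0 dB SNR
taus = np.arange(0, coherence_len + 1)
rates = np.array([effective_rate(t, coherence_len, snr) for t in taus])
best_tau = int(taus[np.argmax(rates)])
```

With no pilots the estimate carries no information, and with only pilots there is no room left for data, so the net rate is zero at both extremes and peaks at an intermediate training length. Finding the corresponding operating point for realistic HBF training procedures is the open question.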

IX. CONCLUSION
HBF systems were first presented a decade ago, but they have received much attention in the past few years due to their essential role in enabling energy-efficient and cost-efficient massive MIMO and mmWave systems in 5G networks. HBF combined with massive MIMO and mmWave can solve many technical problems for the 5G network, ranging from capacity improvement to energy efficiency. In this paper, we have presented a comprehensive classification for HBF based on different criteria and scenarios. We also surveyed the most recent works in each of these classifications. Based on the discussions presented in this work, it is evident that there is no single HBF architecture that can provide the best trade-off between performance, complexity, and cost. Consequently, to obtain the best performance out of hybrid beamforming with the lowest cost and complexity, the architecture must be designed dynamically according to the channel characteristics and the intended applications.