Intelligent Non-Orthogonal Beamforming with Large Self-Interference Cancellation Capability for Full-Duplex Multi-User Massive MIMO Systems

This work introduces a novel full-duplex hybrid beamforming (FD-HBF) technique for millimeter-wave (mmWave) multi-user massive multiple-input multiple-output (MU-mMIMO) systems, where a full-duplex (FD) base station (BS) simultaneously serves half-duplex (HD) downlink and uplink user equipments over the same frequency band. Our main goal is to jointly enhance the downlink/uplink sum-rate capacity by canceling the strong self-interference (SI) power. Furthermore, FD-HBF considerably reduces the hardware cost/complexity in mMIMO systems by interconnecting the radio frequency (RF) and baseband (BB) stages via a small number of RF chains. First, the RF-stage is constructed via the slow time-varying angular information, where two schemes are proposed for both maximizing the intended signal power and canceling the SI power. In particular, the orthogonal RF beamformer (OBF) scheme aims only to cancel the far-field component of SI, while the non-orthogonal RF beamformer (NOBF) scheme applies perturbations to the orthogonal beams to also suppress the near-field component of the SI channel. Because an exhaustive search for the optimal perturbations has high computational complexity, we apply swarm intelligence to search for them. Second, the BB-stage is designed based only on the reduced-size effective intended channel matrices, where the BB precoder/combiner solutions are obtained via regularized zero-forcing (RZF) and minimum mean square error (MMSE). Hence, the proposed FD-HBF technique does not require the instantaneous SI channel knowledge. It is shown that FD-HBF with NOBF+MMSE achieves 78.1 dB SI cancellation (SIC) on its own. Additionally, FD-HBF with practical antenna isolation can accomplish more than 130 dB SIC and reduce the SI power below the noise floor. The numerical results show that FD-HBF approximately doubles the sum-rate capacity compared to its HD counterpart.


I. INTRODUCTION
Massive multiple-input multiple-output (mMIMO) has already become one of the key enabling technologies in the fifth-generation (5G) wireless communication networks [1]-[3]. The third generation partnership project (3GPP) has outlined the utilization of 256 antennas in Release 16 [4]. Such large antenna arrays enable three-dimensional (3D) beamforming, which is especially necessary for focusing the signal energy in the limited scattering propagation experienced in the millimeter-wave (mmWave) frequency bands [5]. Additionally, the shorter wavelengths in mmWave allow employing a large number of antennas in practical mMIMO systems with space limitations on antenna arrays. By means of extremely wide bandwidth and large antenna arrays, the mmWave mMIMO technology enables enhanced mobile broadband (eMBB).

In the literature, various HBF techniques (i.e., hybrid precoding (HP) in downlink, hybrid combining (HC) in uplink) are investigated for HD transmission in [24]-[33] and FD transmission in [34]-[45] as presented in Table 1.
Regarding the HD downlink transmission in MU-mMIMO systems, [24]-[27] employ the full-size instantaneous fast time-varying CSI to design the RF-stage in the HP architecture, whereas [28]-[33] utilize the slow time-varying channel characteristics. Particularly, an eigen-beamforming based HP technique is developed by using the channel covariance matrix in [28], [29], where the RF-stage design requires both phase-shifters and variable-gain amplifiers. Afterwards, an angular-based HP technique is proposed in [30], where the RF-stage is constructed based on the AoD/AoA information and requires only low-cost phase-shifters. As shown in [30], the angular-based HP achieves higher sum-rate capacity compared to other state-of-the-art HP techniques in [27]-[29]. Then, [31] and [32] respectively apply swarm intelligence and deep learning based power allocation along with the angular-based HP to maximize the downlink capacity in MU-mMIMO systems. Based on the cloud radio access network (C-RAN) architecture, the angular-based HP is extended to multi-cell MU-mMIMO systems in [33], where the downlink cooperation strategies among the BSs are considered to mitigate the inter-cell interference.
Regarding the FD communications, the fully-digital beamforming is investigated for the conventional MIMO systems [34], where authors consider the simultaneous downlink and uplink transmission over the same frequency band. The numerical results in [34] present that as the SIC quality improves, FD can double the sum-rate capacity compared to the conventional HD transmission. Afterwards, the FD mMIMO relay systems are designed for the point-to-point backhaul links in [35]- [38], where the relay node with large transmit/receive antenna arrays develops HBF based on the perfect SI channel knowledge. Similarly, the authors in [39] analyze the HBF design in the FD point-to-point mMIMO systems assuming the availability of the perfect SI channel knowledge as well as the full-size intended channels. The proposed HBF technique in [39] closely approaches its fully-digital beamforming counterpart in terms of the sum-rate performance. For supporting multiple downlink UEs via a single FD relay, the FD MU-mMIMO relay system is considered in [40], where the SI channel is modeled as Gaussian noise. Then, [41] investigates the utilization of multiple relays in the FD MU-mMIMO relay systems, where the authors assume the perfect cancellation of SI signal at each FD relay and focus on the suppression of inter-relay interference. In [42], the FD MU-mMIMO systems are modeled to simultaneously support a single downlink UE and a single uplink UE over the same frequency band, where the authors investigate the effect of low-resolution phase-shifters at the RF-stage by using the full-size instantaneous CSI including the SI channel. Afterwards, [43] and [44] investigate the FD MU-mMIMO systems serving multiple downlink/uplink UEs, where both employ the instantaneous SI channel knowledge during the HBF design. Thus, none of the aforementioned works design the HBF technique to enhance the SIC quality without instantaneous SI channel knowledge. 
Recently, the FD mMIMO systems are analyzed for the wireless point-to-point backhaul link (i.e., single-user) in [45], where the proposed HBF technique aims to enhance the SIC quality by using the slow time-varying AoD/AoA information at the RF-stage and the reduced-size effective CSI at the BB-stage. In other words, the HBF design in [45] does not depend on the instantaneous SI channel knowledge, while it also reduces the channel estimation overhead size. Additionally, it utilizes the orthogonal RF beamforming (OBF) technique at the RF-stage for both maximizing the capacity and improving the amount of SIC. Similarly, the fully-analog beamforming is investigated for the FD mMIMO systems in [46], where the authors apply the OBF technique in the point-to-point non-coherent communications.

B. CONTRIBUTIONS
This work proposes a new full-duplex hybrid beamforming (FD-HBF) technique for the mmWave MU-mMIMO systems, where an FD BS simultaneously serves multiple HD downlink and uplink UEs over the same frequency band. The proposed FD-HBF technique has five main objectives: (i) maximizing the intended signal power, (ii) enhancing the SIC quality, (iii) mitigating the interference experienced among UEs, (iv) reducing the hardware cost/complexity, (v) decreasing the channel estimation overhead. Table 1 presents a detailed summary of this work in comparison to the beamforming techniques in the literature.
The main contributions of this work are summarized as:
• Intelligent Non-Orthogonal RF Beamformer (NOBF): Two schemes are developed in the downlink/uplink RF beamformer design via the slow time-varying AoD/AoA information. First, the OBF scheme is designed for the MU-mMIMO systems, which employs orthogonal beams for both maximizing the intended signal power and canceling the far-field component of the SI channel. In order to further improve the SIC quality, the second proposed scheme is the non-orthogonal RF beamformer (NOBF), which applies perturbations to the orthogonal beams to also suppress the near-field component of the SI channel. When the exhaustive search is applied for finding the optimal perturbations, the computational complexity becomes extremely high. Hence, we propose to apply swarm intelligence to find the optimal perturbations in the NOBF scheme with reasonable computational complexity.
• practical antenna isolation techniques, the proposed FD-HBF is capable of reducing the strong SI power below the noise floor by achieving more than 130 dB SIC. Furthermore, we observe that the SIC quality improves as the transmit/receive array size increases.
• Sum-Rate Capacity: By means of the enhanced SIC quality and simultaneous downlink/uplink transmission over the same frequency band, FD-HBF greatly improves the sum-rate capacity compared to its HD counterpart. The numerical results demonstrate that both the downlink and uplink sum-rate capacity can be approximately doubled via the proposed FD transmission scheme.

C. ORGANIZATION
The rest of this paper is organized as follows. The system and channel models are described in Section II. The problem formulation for the proposed FD-HBF technique is introduced in Section III. Then, we develop the RF beamformer and BB precoder/combiner solutions in Section IV and Section V, respectively. The extensive illustrative results are presented in Section VI. Finally, Section VII concludes this paper.

D. NOTATION
Bold upper/lower case letters denote matrices/vectors. $(\cdot)^*$, $(\cdot)^T$, $(\cdot)^H$, $\|\cdot\|$ and $\|\cdot\|_F$ represent the complex conjugate, the transpose, the conjugate transpose, the 2-norm and the Frobenius norm of a vector or matrix, respectively. $\mathbf{I}_K$, $\mathbb{E}\{\cdot\}$, $\mathrm{tr}(\cdot)$ and $\angle(\cdot)$ stand for the $K \times K$ identity matrix, the expectation operator, the trace operator and the argument of a complex number, respectively. $\mathbf{X}(m, n)$ denotes the element at the intersection of the $m$-th row and $n$-th column. $\mathbf{X} \otimes \mathbf{Y}$ denotes the Kronecker product of two matrices $\mathbf{X}$ and $\mathbf{Y}$. We use $x \sim \mathcal{CN}(0, \sigma)$ when $x$ is a complex Gaussian random variable with zero-mean and variance $\sigma$.

II. SYSTEM AND CHANNEL MODEL
We introduce the MU-mMIMO system model and 3D geometry-based mmWave channel model in this section.

A. SYSTEM MODEL
A single-cell MU-mMIMO system is considered for joint downlink and uplink transmission as illustrated in Figure 1.
Here, a BS operates in FD mode to simultaneously serve $K_D$ downlink and $K_U$ uplink single-antenna UEs over the same frequency band. On the other hand, all $K = K_D + K_U$ UEs operate in HD mode considering the hardware/software constraints on UEs (e.g., low power consumption, limited signal processing and active/passive SIC capability) [9]-[12]. As shown in Figure 2, the BS is equipped with transmit/receive uniform rectangular arrays (URAs), which are separated by an antenna isolation block for passive (i.e., propagation domain) SIC [13]-[15]. Specifically, the transmit (receive) URA has $M_D = M_D^{(x)} \times M_D^{(y)}$ ($M_U = M_U^{(x)} \times M_U^{(y)}$) antennas, where $M_D^{(x)}$ and $M_D^{(y)}$ ($M_U^{(x)}$ and $M_U^{(y)}$) denote the number of transmit (receive) antennas along x-axis and y-axis, respectively.
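In the proposed FD-HBF architecture, we jointly employ HP and HC schemes in the downlink and uplink transmission, respectively. Therefore, we aim to develop four sub-blocks: (i) the downlink RF beamformer $\mathbf{F}_D \in \mathbb{C}^{M_D \times N_D}$, (ii) the downlink BB precoder $\mathbf{B}_D = [\mathbf{b}_{D,1}, \cdots, \mathbf{b}_{D,K_D}] \in \mathbb{C}^{N_D \times K_D}$, (iii) the uplink RF beamformer $\mathbf{F}_U \in \mathbb{C}^{N_U \times M_U}$, and (iv) the uplink BB combiner $\mathbf{B}_U = [\mathbf{b}_{U,1}, \cdots, \mathbf{b}_{U,K_U}]^T \in \mathbb{C}^{K_U \times N_U}$. Here, $\mathbf{b}_{D,k} \in \mathbb{C}^{N_D}$ denotes the BB precoder vector for the $k$-th downlink UE. Similarly, $\mathbf{b}_{U,k} \in \mathbb{C}^{N_U}$ is the BB combiner vector for the $k$-th uplink UE. In order to support $K_D$ downlink ($K_U$ uplink) UEs and reduce the hardware cost/complexity, $N_D$ ($N_U$) RF chains are utilized to interconnect the RF-stage and BB-stage with $N_D \ll M_D$ ($N_U \ll M_U$). Additionally, it is worthwhile to mention that the RF beamformer matrices (i.e., $\mathbf{F}_D$ and $\mathbf{F}_U$) are built via low-cost phase-shifters to further reduce the hardware cost/complexity, which brings the constant modulus constraint in the RF beamformer design.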
On the other hand, the downlink channel matrix is denoted as $\mathbf{H}_D = [\mathbf{h}_{D,1}, \cdots, \mathbf{h}_{D,K_D}]^T \in \mathbb{C}^{K_D \times M_D}$. In the received downlink signal, $\mathbf{d}_U = [d_{U,1}, \cdots, d_{U,K_U}]^T \in \mathbb{C}^{K_U}$ is the uplink data signal vector and $\mathbf{w}_D = [w_{D,1}, \cdots, w_{D,K_D}]^T \sim \mathcal{CN}(\mathbf{0}, \sigma_w^2 \mathbf{I}_{K_D})$ is the complex circularly symmetric Gaussian noise vector. Here, we define $P_U$ as the transmit power of each uplink UE. Similar to the downlink data signal vector, the uplink data signal vector is also encoded by an i.i.d. Gaussian codebook (i.e., the i.i.d. entries of $\mathbf{d}_U$ follow the distribution $\mathcal{CN}(0, P_U)$, so we have $\mathbb{E}\{\mathbf{d}_U \mathbf{d}_U^H\} = P_U \mathbf{I}_{K_U}$). Then, the received signal at the $k$-th downlink UE, given in (2), includes the intended signal, the IUI from the signals intended for the other $K_D - 1$ downlink UEs, the IUI generated by the $K_U$ uplink UEs, as well as the noise. Thus, each downlink UE is exposed to IUI from $K_D + K_U - 1$ UEs in total due to the FD transmission. After some mathematical manipulations, the instantaneous signal-to-interference-plus-noise-ratio (SINR) at the $k$-th downlink UE is derived in (3).

For the uplink transmission, the combined signal vector at the BS, given in (4), contains the effective SI channel $\mathbf{F}_U \mathbf{H}_{SI} \mathbf{F}_D$ seen from the BB-stage (i.e., after applying the downlink/uplink RF beamformers). Hence, in addition to the intended signal, the combined signal for the $k$-th uplink UE, given in (5), consists of the IUI generated by the other $K_U - 1$ uplink UEs, the noise and the strong SI signal due to the FD transmission. Similar to (3), the instantaneous SINR for the $k$-th uplink UE is obtained in (6).
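The SINR expressions referred to above can be illustrated numerically. The following NumPy snippet is a minimal sketch, not the paper's exact formulas: it assumes unit-power downlink data symbols (i.e., $\mathbb{E}\{\mathbf{d}_D \mathbf{d}_D^H\} = \mathbf{I}_{K_D}$, which is not stated in this section), spatially white receiver noise of variance $\sigma_w^2$ at the BS, and the matrix shapes listed in the docstrings. The function names `downlink_sinr` and `uplink_sinr` are hypothetical.

```python
import numpy as np

def downlink_sinr(H_D, F_D, B_D, H_IUI, P_U, sigma2):
    """Per-UE downlink SINR (assumption: unit-power downlink symbols).
    Shapes: H_D (K_D, M_D), F_D (M_D, N_D), B_D (N_D, K_D), H_IUI (K_D, K_U)."""
    G = np.abs(H_D @ F_D @ B_D) ** 2              # G[k, q] = |h_k^T F_D b_q|^2
    signal = np.diag(G)                           # intended signal power
    iui_dl = G.sum(axis=1) - signal               # IUI from other downlink streams
    iui_ul = P_U * np.sum(np.abs(H_IUI) ** 2, axis=1)  # IUI from uplink UEs
    return signal / (iui_dl + iui_ul + sigma2)

def uplink_sinr(H_U, F_U, B_U, H_SI, F_D, B_D, P_U, sigma2):
    """Per-UE uplink SINR including the self-interference term.
    Shapes: H_U (M_U, K_U), F_U (N_U, M_U), B_U (K_U, N_U), H_SI (M_U, M_D)."""
    W = B_U @ F_U                                 # row k is b_k^T F_U
    G = np.abs(W @ H_U) ** 2                      # G[k, q] = |b_k^T F_U h_q|^2
    signal = P_U * np.diag(G)
    iui = P_U * (G.sum(axis=1) - np.diag(G))      # IUI from other uplink UEs
    si = np.sum(np.abs(W @ H_SI @ F_D @ B_D) ** 2, axis=1)  # SI power
    noise = sigma2 * np.sum(np.abs(W) ** 2, axis=1)         # combined noise power
    return signal / (iui + si + noise)
```

Setting `H_SI` to zero in `uplink_sinr` gives the SI-free uplink SINR, so comparing the two calls shows how much the residual SI degrades each uplink stream.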

B. CHANNEL MODEL
All three types of channels illustrated in Figure 1 are modeled in this subsection: (i) intended downlink/uplink channels $\mathbf{H}_D$ and $\mathbf{H}_U$, (ii) SI channel $\mathbf{H}_{SI}$, (iii) IUI channel $\mathbf{H}_{IUI}$.

1) Intended Channel
The mmWave channels experience a limited scattering propagation environment, unlike the rich scattering in sub-6 GHz channels [5]. Thus, the 3D geometry-based stochastic channel model is employed for mmWave communications [49]. According to the URA structure [50], the intended channel vector for the $k$-th downlink UE is defined in (7), where $Q_D$ is the total number of downlink paths, $\tau_{D,k_l}$ and $z_{D,k_l} \sim \mathcal{CN}(0, \frac{1}{Q_D})$ are the distance and complex path gain of the $l$-th path, respectively, $\eta$ is the path loss exponent, $\gamma^{(x)}_{D,k_l} = \sin(\theta_{D,k_l}) \cos(\psi_{D,k_l}) \in [-1, 1]$ and $\gamma^{(y)}_{D,k_l} = \sin(\theta_{D,k_l}) \sin(\psi_{D,k_l}) \in [-1, 1]$ are the angular coefficients reflecting the elevation AoD (EAoD) $\theta_{D,k_l}$ and azimuth AoD (AAoD) $\psi_{D,k_l}$, and $\mathbf{a}_D(\cdot, \cdot) \in \mathbb{C}^{M_D}$ is the downlink array phase response vector defined as:

$$\mathbf{a}_D(\gamma_x, \gamma_y) = \left[1, e^{j2\pi d \gamma_x}, \cdots, e^{j2\pi d (M_D^{(x)} - 1)\gamma_x}\right]^H \otimes \left[1, e^{j2\pi d \gamma_y}, \cdots, e^{j2\pi d (M_D^{(y)} - 1)\gamma_y}\right]^H, \quad (8)$$

where $d = 0.5$ is the normalized half-wavelength distance between antennas. Here, $\theta_{D,k_l} \in [\theta_D - \delta^{\theta}_D, \theta_D + \delta^{\theta}_D]$ is the EAoD with mean $\theta_D$ and spread $\delta^{\theta}_D$. Also, $\psi_{D,k_l} \in [\psi_D - \delta^{\psi}_D, \psi_D + \delta^{\psi}_D]$ is the AAoD with mean $\psi_D$ and spread $\delta^{\psi}_D$. It is important to remark that all downlink UEs are clustered in a similar geographical region [28]-[30]. Thus, without loss of generality, all downlink UEs experience the same EAoD/AAoD mean and spread (i.e., $\theta_D$, $\psi_D$, $\delta^{\theta}_D$, $\delta^{\psi}_D$) [51]. As expressed in (7), the intended downlink channel is composed of two parts: (i) the fast time-varying path gain vector and (ii) the slow time-varying downlink array phase response matrix. Finally, by using (7), the downlink channel matrix is written in (9), where $\mathbf{Z}_D = [\mathbf{z}_{D,1}, \cdots, \mathbf{z}_{D,K_D}]^T \in \mathbb{C}^{K_D \times Q_D}$ is the concatenated path gain matrix for all downlink UEs.
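As a rough illustration of the geometry-based model described above, the sketch below builds the URA array phase response vector and draws one downlink channel vector as a sum of $Q_D$ paths. Several choices are assumptions made only for this example: angles are drawn uniformly within the stated mean and spread, all paths share a single distance `tau`, and the conjugation follows the Hermitian superscript as it appears in (8). The helper names `ura_response` and `downlink_channel_vector` are hypothetical.

```python
import numpy as np

def ura_response(gamma_x, gamma_y, Mx, My, d=0.5):
    """Array phase response of an Mx-by-My URA (length Mx*My).
    Conjugation mirrors the Hermitian superscript in (8)."""
    ax = np.exp(1j * 2 * np.pi * d * np.arange(Mx) * gamma_x)
    ay = np.exp(1j * 2 * np.pi * d * np.arange(My) * gamma_y)
    return np.kron(ax.conj(), ay.conj())

def downlink_channel_vector(Mx, My, Q, theta_mean, theta_spread,
                            psi_mean, psi_spread, tau, eta, rng):
    """One channel vector as a sum of Q paths.
    Assumptions: uniform angles inside the spread, common distance tau."""
    theta = rng.uniform(theta_mean - theta_spread, theta_mean + theta_spread, Q)
    psi = rng.uniform(psi_mean - psi_spread, psi_mean + psi_spread, Q)
    # Complex path gains z ~ CN(0, 1/Q): variance 1/(2Q) per real dimension.
    z = (rng.standard_normal(Q) + 1j * rng.standard_normal(Q)) * np.sqrt(1 / (2 * Q))
    gx = np.sin(theta) * np.cos(psi)      # angular coefficient along x
    gy = np.sin(theta) * np.sin(psi)      # angular coefficient along y
    A = np.stack([ura_response(gx[l], gy[l], Mx, My) for l in range(Q)])  # (Q, M)
    return (z * tau ** (-eta)) @ A        # fast-varying gains times slow-varying phases
```

The split into `z` (path gains) and `A` (phase response matrix) mirrors the two-part decomposition of the channel into fast and slow time-varying components.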

VOLUME XX, 2022
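Afterwards, similar to (7), the intended channel vector for the $k$-th uplink UE is defined in (10), where $Q_U$ is the total number of uplink paths, $\tau_{U,k_l}$ and $z_{U,k_l} \sim \mathcal{CN}(0, \frac{1}{Q_U})$ are respectively the distance and path gain of the $l$-th path, $\gamma^{(x)}_{U,k_l} = \sin(\theta_{U,k_l}) \cos(\psi_{U,k_l}) \in [-1, 1]$ and $\gamma^{(y)}_{U,k_l} = \sin(\theta_{U,k_l}) \sin(\psi_{U,k_l}) \in [-1, 1]$ are based on the elevation AoA (EAoA) $\theta_{U,k_l}$ and azimuth AoA (AAoA) $\psi_{U,k_l}$, and $\mathbf{a}_U(\cdot, \cdot) \in \mathbb{C}^{M_U}$ is the uplink array phase response vector given in (11). As in the downlink, $\theta_{U,k_l} \in [\theta_U - \delta^{\theta}_U, \theta_U + \delta^{\theta}_U]$ is the EAoA with mean $\theta_U$ and spread $\delta^{\theta}_U$, and $\psi_{U,k_l} \in [\psi_U - \delta^{\psi}_U, \psi_U + \delta^{\psi}_U]$ is the AAoA with mean $\psi_U$ and spread $\delta^{\psi}_U$. By using (10), the intended uplink channel is divided into two parts: (i) the fast time-varying path gain vector and (ii) the slow time-varying uplink array phase response matrix. By applying (10), the uplink channel matrix is given in (12), where $\mathbf{Z}_U = [\mathbf{z}_{U,1}, \cdots, \mathbf{z}_{U,K_U}] \in \mathbb{C}^{Q_U \times K_U}$ is the concatenated path gain matrix for all uplink UEs.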

2) Self-Interference Channel
As shown in Figure 2, the complete SI channel includes two components, $\mathbf{H}_{SI} = \mathbf{H}_{LoS} + \mathbf{H}_{NLoS}$ [35]-[39], [43], [45], [46], where $\mathbf{H}_{LoS} \in \mathbb{C}^{M_U \times M_D}$ is the residual near-field SI channel via line-of-sight (LoS) paths after applying the antenna isolation and $\mathbf{H}_{NLoS} \in \mathbb{C}^{M_U \times M_D}$ is the far-field SI channel via the reflected non-line-of-sight (NLoS) paths. We first define the residual near-field SI channel via the spherical wavefront instead of the planar wavefront due to the short distance between the transmit and receive URAs [45], [52]. Hence, the near-field SI channel between the $(m, n)$-th transmit and $(u, v)$-th receive antennas is given in (14) (please see Figure 2), where $\Delta_{(m,n) \rightarrow (u,v)}$ is the distance normalized by wavelength between the corresponding antennas and $\kappa$ is the normalization scalar to satisfy $10 \log_{10} \|\mathbf{H}_{LoS}\|_F^2 = -P_{IS,dB}$ as the residual near-field SI channel power, with $P_{IS,dB}$ as the amount of SIC achieved by the antenna isolation. The distance $\Delta_{(m,n) \rightarrow (u,v)}$ is calculated in (15), where $C_x$, $C_y$ and $C_z$ are the distances between the transmit/receive URAs normalized by wavelength along x-axis, y-axis and z-axis, respectively, and $\Theta$ is the rotation angle of the receive URA along y-axis. Afterwards, the far-field SI channel $\mathbf{H}_{NLoS}$ is modeled via the planar wavefront and the 3D geometry-based stochastic channel model similar to $\mathbf{H}_D$ and $\mathbf{H}_U$. By using (7), (8), (10) and (11), the far-field SI channel is defined in (16), where $Q_{SI}$ is the total number of reflected NLoS paths, $\tau_{SI,l}$ and $z_{SI,l} \sim \mathcal{CN}(0, \frac{1}{Q_{SI}})$ are the distance and complex path gain, respectively, and $\gamma^{(x)}_{SI,D,l} = \sin(\theta_{SI,D,l}) \cos(\psi_{SI,D,l})$ and $\gamma^{(y)}_{SI,D,l} = \sin(\theta_{SI,D,l}) \sin(\psi_{SI,D,l})$ are calculated via the EAoD $\theta_{SI,D,l} \in [\theta_{SI,D} - \delta^{\theta}_{SI,D}, \theta_{SI,D} + \delta^{\theta}_{SI,D}]$ and AAoD $\psi_{SI,D,l} \in [\psi_{SI,D} - \delta^{\psi}_{SI,D}, \psi_{SI,D} + \delta^{\psi}_{SI,D}]$.
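The normalization of the near-field component can be sketched as follows. This is an illustration under stated assumptions rather than the paper's equations (14) and (15): each entry is assumed to follow the common spherical-wave form $\frac{1}{\Delta} e^{-j 2\pi \Delta}$ before scaling, antenna coordinates are passed in explicitly (in wavelengths) instead of being derived from $C_x$, $C_y$, $C_z$ and $\Theta$, and the receive array is not rotated. The helper names `ura_positions` and `near_field_si_channel` are hypothetical.

```python
import numpy as np

def ura_positions(Mx, My, offset=(0.0, 0.0, 0.0), d=0.5):
    """Antenna coordinates (in wavelengths) of a planar Mx-by-My URA at z = 0,
    shifted by `offset`. No rotation is modeled (simplifying assumption)."""
    x, y = np.meshgrid(np.arange(Mx) * d, np.arange(My) * d, indexing="ij")
    pos = np.stack([x.ravel(), y.ravel(), np.zeros(Mx * My)], axis=1)
    return pos + np.asarray(offset)

def near_field_si_channel(tx_pos, rx_pos, P_IS_dB):
    """Spherical-wavefront LoS channel of shape (M_U, M_D), scaled so that
    10*log10(||H||_F^2) = -P_IS_dB. Entry form (1/Delta)*exp(-j*2*pi*Delta)
    is an assumed model."""
    delta = np.linalg.norm(rx_pos[:, None, :] - tx_pos[None, :, :], axis=2)
    H = np.exp(-1j * 2 * np.pi * delta) / delta
    kappa = np.sqrt(10 ** (-P_IS_dB / 10) / np.sum(np.abs(H) ** 2))
    return kappa * H
```

For example, `near_field_si_channel(ura_positions(2, 2), ura_positions(2, 2, offset=(2.0, 0.0, 0.0)), 40.0)` returns a 4-by-4 matrix whose total power corresponds to 40 dB of antenna isolation.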

3) Inter-User Interference Channel
By using (8) and (11), the transmit and receive array phase response vectors reduce to a single coefficient when we consider the IUI channel for any pair of single-antenna UEs. Hence, the IUI channel between the $q$-th uplink UE and the $k$-th downlink UE can be simply defined as $\mathbf{H}_{IUI}(k, q) = \tau_{IUI,k,q}^{-\eta} z_{IUI,k,q}$ with $q = 1, \cdots, K_U$ and $k = 1, \cdots, K_D$, where $\tau_{IUI,k,q}$ and $z_{IUI,k,q} \sim \mathcal{CN}(0, 1)$ are the distance and path gain for the corresponding UEs, respectively.

III. FULL-DUPLEX HYBRID BEAMFORMING: PROBLEM FORMULATION
In the proposed full-duplex hybrid beamforming (FD-HBF) technique, we aim to jointly enhance the downlink and uplink performance in MU-mMIMO systems. By employing (3) and (6), we first define the optimization problem that maximizes the total downlink/uplink sum-rate in (17), where $\mathcal{F}_D$ ($\mathcal{F}_U$) represents the set of downlink (uplink) RF beamformers satisfying the constant modulus constraint due to the low-cost phase-shifters. As expressed in (2) and (3), the downlink sum-rate $R_D(\mathbf{F}_D, \mathbf{B}_D)$ depends on two sub-blocks in FD-HBF, namely the downlink RF beamformer and the BB precoder. On the other hand, the uplink sum-rate is a function of all four sub-blocks (i.e., downlink/uplink RF beamformers, BB precoder/combiner) due to the presence of the SI signal in FD transmission as shown in (5) and (6).
However, the sum-rate maximization given in (17) is a non-convex optimization problem due to the constant-modulus constraint at the RF-stage [21]. Thus, we sequentially develop the RF-stage and BB-stage. According to (2) and (5), the proposed FD-HBF technique has five main design objectives:
1) Maximize the intended downlink and uplink signal power (i.e., $|\mathbf{h}_{D,k}^T \mathbf{F}_D \mathbf{b}_{D,k}|^2$ and $|\mathbf{b}_{U,k}^T \mathbf{F}_U \mathbf{h}_{U,k}|^2$),
2) Enhance the quality of SIC by mitigating the strong SI signal power (i.e., $\|\mathbf{b}_{U,k}^T \mathbf{F}_U \mathbf{H}_{SI} \mathbf{F}_D \mathbf{B}_D\|^2$),
3) Suppress the IUI signal power in downlink and uplink (i.e., $\sum_{q \neq k} |\mathbf{h}_{D,k}^T \mathbf{F}_D \mathbf{b}_{D,q}|^2$ and $\sum_{q \neq k} |\mathbf{b}_{U,k}^T \mathbf{F}_U \mathbf{h}_{U,q}|^2$),
4) Reduce the hardware cost/complexity with the utilization of few RF chains (i.e., $N_D \ll M_D$ and $N_U \ll M_U$),
5) Decrease the channel estimation overhead size.
In the light of the above objectives, the proposed RF beamformer and BB precoder/combiner solutions are discussed in Section IV and Section V, respectively.

IV. RF BEAMFORMER
The downlink and uplink RF beamformers are jointly developed to maximize the beamforming gain in the intended direction while mitigating the strong SI according to the objectives outlined in Section III. Furthermore, instead of using the full-size fast time-varying channel matrices (i.e., $\mathbf{H}_D$, $\mathbf{H}_U$ and $\mathbf{H}_{SI}$), we only utilize the slow time-varying AoD/AoA information (i.e., $\mathbf{A}_D$, $\mathbf{A}_U$, $\mathbf{A}_{SI,D}$ and $\mathbf{A}_{SI,U}$) to decrease the large CSI overhead size in MU-mMIMO systems. Additionally, the AoD/AoA information is exploited to minimize the RF chain utilization in the FD-HBF technique.

A. ORTHOGONAL BEAMFORMER (OBF)
In the OBF scheme, our motivation is to enhance the SIC quality by suppressing especially the far-field component of the SI channel (i.e., the reflected NLoS paths) via orthogonal beams designed towards the intended direction, because the far-field component becomes dominant in comparison to the residual near-field component (i.e., LoS paths) after utilizing the antenna isolation [54]. Moreover, the experimental studies in [14] demonstrate that the dominant far-field component deteriorates the quality of antenna isolation based SIC. By using (13) and (16), the effective reduced-size SI channel matrix seen from the BB-stage, $\mathbf{F}_U \mathbf{H}_{SI} \mathbf{F}_D = \mathbf{F}_U \mathbf{H}_{LoS} \mathbf{F}_D + \mathbf{F}_U \mathbf{H}_{NLoS} \mathbf{F}_D$, is required to be approximately zero in (18), where the approximate zero condition can be addressed via not only antenna isolation based SIC for the near-field component but also joint downlink/uplink RF beamformer based SIC for the far-field component. Thus, the columns of the downlink RF beamformer $\mathbf{F}_D$ and the rows of the uplink RF beamformer $\mathbf{F}_U$ should be in the null space of the slow time-varying array phase response matrices $\mathbf{A}_{SI,D}$ and $\mathbf{A}_{SI,U}$, respectively, in order to suppress the far-field component of the SI channel (i.e., $\mathrm{Span}(\mathbf{F}_D) \subset \mathrm{Null}(\mathbf{A}_{SI,D})$ and $\mathrm{Span}(\mathbf{F}_U) \subset \mathrm{Null}(\mathbf{A}_{SI,U})$). Here, the AoD and AoA supports for the far-field SI channel are respectively defined in (19), where $\boldsymbol{\theta}_{SI,\Omega} = [\theta_{SI,\Omega} - \delta^{\theta}_{SI,\Omega}, \theta_{SI,\Omega} + \delta^{\theta}_{SI,\Omega}]$ and $\boldsymbol{\psi}_{SI,\Omega} = [\psi_{SI,\Omega} - \delta^{\psi}_{SI,\Omega}, \psi_{SI,\Omega} + \delta^{\psi}_{SI,\Omega}]$ represent the elevation and azimuth angle boundaries, respectively, with $\Omega \in \{D, U\}$.
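Afterwards, the downlink/uplink RF beamformers are also required to maximize the intended signal power in the desired direction. By using (9), the effective downlink channel matrix is defined as $\mathbf{H}_D \mathbf{F}_D$. For maximizing the intended downlink signal power, we should choose the columns of $\mathbf{F}_D$ within the subspace spanned by $\mathbf{A}_D$ (i.e., $\mathrm{Span}(\mathbf{F}_D) \subset \mathrm{Span}(\mathbf{A}_D)$). Given that $\mathbf{A}_D$ is a function of the slow time-varying AoD information, the AoD support for the downlink channel is written in (20), where $\boldsymbol{\theta}_D = [\theta_D - \delta^{\theta}_D, \theta_D + \delta^{\theta}_D]$ and $\boldsymbol{\psi}_D = [\psi_D - \delta^{\psi}_D, \psi_D + \delta^{\psi}_D]$ represent the boundaries of EAoD and AAoD, respectively. Then, by using (12), the effective uplink channel matrix is defined as $\mathbf{F}_U \mathbf{H}_U$. Similarly, the rows of $\mathbf{F}_U$ should be in the subspace spanned by $\mathbf{A}_U$ (i.e., $\mathrm{Span}(\mathbf{F}_U) \subset \mathrm{Span}(\mathbf{A}_U)$). Furthermore, the AoA support for the uplink channel is given in (21), where $\boldsymbol{\theta}_U = [\theta_U - \delta^{\theta}_U, \theta_U + \delta^{\theta}_U]$ and $\boldsymbol{\psi}_U = [\psi_U - \delta^{\psi}_U, \psi_U + \delta^{\psi}_U]$ are respectively the boundaries of EAoA and AAoA.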
Finally, the downlink RF beamformer $\mathbf{F}_D$ should be chosen from the intersection of $\mathrm{Span}(\mathbf{A}_D)$ and $\mathrm{Null}(\mathbf{A}_{SI,D})$. Similarly, the uplink RF beamformer $\mathbf{F}_U$ should be at the intersection of $\mathrm{Span}(\mathbf{A}_U)$ and $\mathrm{Null}(\mathbf{A}_{SI,U})$. Hence, the design criteria in the OBF scheme, given in (22), are $\mathrm{Span}(\mathbf{F}_D) \subset \mathrm{Span}(\mathbf{A}_D) \cap \mathrm{Null}(\mathbf{A}_{SI,D})$ and $\mathrm{Span}(\mathbf{F}_U) \subset \mathrm{Span}(\mathbf{A}_U) \cap \mathrm{Null}(\mathbf{A}_{SI,U})$. To satisfy these criteria for $\mathbf{F}_D$ and $\mathbf{F}_U$, by using (8) and (11), we employ unit-power downlink and uplink steering vectors $\mathbf{e}_D(\gamma_x, \gamma_y)$ and $\mathbf{e}_U(\gamma_x, \gamma_y)$, i.e., the array phase response vectors scaled to unit norm. Furthermore, in order to cover the complete 3D angular support, the orthogonal quantized angle-pairs are defined in (23), and they satisfy the orthogonality property in (24). By means of the orthogonality, we have $M_D$ ($M_U$) orthogonal downlink (uplink) steering vectors. It is worthwhile to note that the orthogonal angle-pairs given in (23) provide the minimum number of steering vectors needed to span the complete 3D elevation and azimuth angular support. The design criteria expressed in (22) can be satisfied by selecting the orthogonal angle-pairs that cover the angular support of the intended channel and exclude the angular support of the SI channel. By using (19), (20), (21) and (23), the corresponding orthogonal angle-pairs are obtained in (25). For these angle-pairs, the response of the uplink steering vectors to $\mathbf{A}_{SI,U}$ converges to 0 for a large receive URA. Hence, by means of extremely narrow beams via a large number of antennas, the limit condition in (26) holds for the far-field SI channel, which implies that the SIC quality on the far-field SI channel can be improved by jointly employing the corresponding downlink/uplink steering vectors and accommodating large transmit/receive URAs as in the mMIMO systems.
Considering that $N_D$ angle-pairs satisfy (25) in the downlink transmission, the OBF scheme derives the closed-form solution of the downlink RF beamformer in (27) from the corresponding $N_D$ steering vectors. Similarly, assuming $N_U$ angle-pairs satisfy (25) in the uplink transmission, the OBF scheme obtains the uplink RF beamformer in (28). According to (27) and (28), the downlink and uplink RF beamformers require $N_D$ transmit and $N_U$ receive RF chains in the proposed FD-HBF architecture, respectively. Furthermore, $\mathbf{F}_D$ and $\mathbf{F}_U$ require $\log_2(M_D)$ and $\log_2(M_U)$ bit resolution phase-shifters to realize the steering vectors with the quantized angle-pairs defined in (23). Moreover, the RF beamformers designed via the OBF scheme satisfy the constant modulus constraint given in (17).
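The orthogonality of the quantized steering vectors can be checked numerically. The sketch below assumes that the quantized angles take the form $\lambda_n = -1 + \frac{2n-1}{M}$ for $n = 1, \cdots, M$; this form is not shown in the text here, but it reproduces the ULA values 0.25 and −0.25 quoted in Section IV-B for beam indices 3 and 2 if the array has four elements (itself an inference). The steering vector is assumed to be the array response of (8) scaled to unit norm, and the function names are hypothetical.

```python
import numpy as np

def quantized_angles(M):
    """Assumed orthogonal quantized angles: -1 + (2n - 1)/M for n = 1..M."""
    n = np.arange(1, M + 1)
    return -1 + (2 * n - 1) / M

def steering_vector(lam_x, lam_y, Mx, My, d=0.5):
    """Unit-norm URA steering vector with constant-modulus entries."""
    ax = np.exp(1j * 2 * np.pi * d * np.arange(Mx) * lam_x)
    ay = np.exp(1j * 2 * np.pi * d * np.arange(My) * lam_y)
    return np.kron(ax.conj(), ay.conj()) / np.sqrt(Mx * My)

def obf_codebook(Mx, My):
    """All Mx*My orthogonal steering vectors stacked as columns."""
    return np.stack([steering_vector(lx, ly, Mx, My)
                     for lx in quantized_angles(Mx)
                     for ly in quantized_angles(My)], axis=1)

def obf_downlink_beamformer(Mx, My, selected):
    """Pick N_D columns (indices in `selected`) as a downlink RF beamformer.
    Which indices satisfy the angular-support criteria is left to the caller."""
    return obf_codebook(Mx, My)[:, selected]
```

Under these assumptions, `obf_codebook(4, 2)` is a unitary 8-by-8 matrix, so any subset of its columns forms an RF beamformer with orthonormal, constant-modulus columns.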

B. NON-ORTHOGONAL BEAMFORMER (NOBF)
The primary SIC objective in the OBF scheme is to mitigate the far-field component by utilizing the appropriate quantized angle-pairs defined in (25). By (18), (26), (27) and (28), the residual effective SI channel after RF beamforming with the OBF scheme is approximated in (29) by its near-field part $\mathbf{F}_U \mathbf{H}_{LoS} \mathbf{F}_D$, where $\mathbf{H}_{LoS}$, as the near-field LoS component, remains the same for a relatively long time interval due to the fixed transmit and receive URA locations [55]. Unlike $\mathbf{H}_{NLoS}$, the fast time-varying far-field NLoS component, an accurate estimate of $\mathbf{H}_{LoS}$ can be obtained [55]-[57].
In the non-orthogonal RF beamformer (NOBF) scheme, we propose to optimize the RF beamformers via new non-orthogonal angles to jointly suppress the near-field and far-field components. Given the set of orthogonal angle-pairs with indices $n_i$, $i = 1, \cdots, N_\Omega$ and $\Omega \in \{D, U\}$, in the downlink and uplink RF beamformers given in (27) and (28), respectively, we add a perturbation to each of them to obtain non-orthogonal angle-pairs that further suppress the residual SI due to the near-field component given in (29). The new non-orthogonal angle-pairs are given in (30) (see footnote 4), where the range of each perturbation coefficient is equivalent to half of the distance between the neighboring orthogonal angles given in (23). Moreover, the perturbation is uniformly quantized within the above defined range by $\chi$ levels for $\Omega \in \{D, U\}$. Around each orthogonal angle, we select $\chi = 5$ quantization levels to specify the possible non-orthogonal angles defined in (30). For each perturbation boundary, a non-orthogonal angle is chosen to minimize the near-field SI power as shown in (46).

[…] = 0 dB, we first observe that the OBF scheme achieves between 7.26 dB and 21.88 dB SIC. For instance, the maximum SIC is achieved between the 3rd downlink and 2nd uplink beam index with the orthogonal angles of $\lambda_{D,3} = 0.25$ and $\lambda_{U,2} = -0.25$, respectively. On the other hand, Figure 3(c) reveals that the NOBF scheme exploits the non-orthogonal angles to further enhance the SIC quality. Specifically, NOBF accomplishes between 18.04 dB and 28.67 dB SIC. For the maximum SIC observed between the 3rd downlink and 2nd uplink beam index, the non-orthogonal angles are found as $\hat{\lambda}_{D,3} = 0.125$ and $\hat{\lambda}_{U,2} = -0.125$ by applying (30) and (46).

Footnote 4: It is important to highlight that when the BS is equipped with a transmit/receive ULA (as a special case of URA structure), the orthogonal angle-pairs […]
Additionally, when each downlink/uplink beam index pair is compared, NOBF enhances the amount of SIC by between 5.59 dB and 9.51 dB with respect to the OBF scheme. By further increasing the number of quantization levels inside the perturbation boundary, the SIC quality can be improved even further.

Proposition 2: For arbitrary $N_D$ and $N_U$, the NOBF scheme applies the perturbation to all $N_D$ downlink and $N_U$ uplink orthogonal angle-pairs as expressed in (30). Afterwards, the non-orthogonal downlink RF beamformer is constructed in (31). Similarly, the non-orthogonal uplink RF beamformer is built in (32). For finding the $2(N_D + N_U)$ perturbation coefficients with reasonable complexity, we propose two approaches.

1) Sub-Optimal
For reducing the computational complexity, the optimization problem given in (47) is converted into two low-complexity sub-optimal problems in (33), one for each downlink and one for each uplink RF beamformer vector, where the main objective is to find the best perturbation coefficients for each non-orthogonal RF beamformer vector. Hence, this approach individually optimizes each downlink and uplink RF beamformer to mitigate the SI power, whereas (47) jointly optimizes them. Here, only $\chi^2$ comparisons are needed for each RF beamformer vector because there are two perturbation coefficients for each one. Considering all $N_D$ downlink and $N_U$ uplink RF beamformer vectors, the total number of comparisons in the sub-optimal solution equals $\chi^2 (N_D + N_U)$, while it is $\chi^{2(N_D + N_U)}$ for (47). When we use the same numerical example with $N_D = N_U = 8$ and $\chi = 5$, the sub-optimal approach reduces the number of comparisons in the exhaustive search from $2.3 \times 10^{22}$ to only 400.
After finding the solutions of (33), the corresponding perturbation coefficients are substituted in (30), (31) and (32) to develop the downlink/uplink RF beamformers in the NOBF scheme. Even though the solutions of (33) are efficiently calculated to further suppress the SI power, they are not necessarily the optimal set of perturbation coefficients. In Section VI-A, the effectiveness of the proposed sub-optimal approach is numerically presented.
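The per-beam grid search and the comparison counts above can be sketched as follows. The per-beam objective used here, the near-field SI power $\|\mathbf{H}_{LoS} \mathbf{f}\|^2$ of a single downlink beam $\mathbf{f}$, is an assumption made for illustration, since the exact form of (33) is not reproduced in this text. The perturbation range $[-\frac{1}{M}, \frac{1}{M}]$ likewise assumes a spacing of $\frac{2}{M}$ between neighboring orthogonal angles. All function names are hypothetical.

```python
import itertools
import numpy as np

def perturbation_levels(M, chi=5):
    """chi uniform levels in [-1/M, 1/M] (assumed half-spacing range)."""
    return np.linspace(-1.0 / M, 1.0 / M, chi)

def steering_vector(lam_x, lam_y, Mx, My, d=0.5):
    """Unit-norm URA steering vector (same assumed form as the array response)."""
    ax = np.exp(1j * 2 * np.pi * d * np.arange(Mx) * lam_x)
    ay = np.exp(1j * 2 * np.pi * d * np.arange(My) * lam_y)
    return np.kron(ax.conj(), ay.conj()) / np.sqrt(Mx * My)

def suboptimal_downlink_beam(H_LoS, lam_x, lam_y, Mx, My, chi=5):
    """Grid search over chi^2 perturbation pairs for one downlink beam.
    Returns (minimum SI power, beta_x, beta_y)."""
    best = None
    for bx, by in itertools.product(perturbation_levels(Mx, chi),
                                    perturbation_levels(My, chi)):
        f = steering_vector(lam_x + bx, lam_y + by, Mx, My)
        power = np.linalg.norm(H_LoS @ f) ** 2    # assumed per-beam objective
        if best is None or power < best[0]:
            best = (power, bx, by)
    return best

def comparison_counts(N_D, N_U, chi):
    """(sub-optimal count, exhaustive joint-search count)."""
    return chi ** 2 * (N_D + N_U), chi ** (2 * (N_D + N_U))
```

With $N_D = N_U = 8$ and $\chi = 5$, `comparison_counts(8, 8, 5)` returns 400 and $5^{32} \approx 2.3 \times 10^{22}$, matching the figures quoted above. For a four-element axis, `perturbation_levels(4)` contains ±0.125, which is consistent with the shift from 0.25 to 0.125 reported in the numerical example.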

2) Particle Swarm Optimization (PSO)
Instead of applying exhaustive search to optimize the objective function in (47), we propose to apply the PSO algorithm (see footnote 5). Here, we employ $N_p$ search agents (i.e., particles) to explore the optimization search space of perturbation coefficients with $2(N_D + N_U)$ dimensions. During $T$ iterations, the particles communicate with each other and move through the search space with the aim of reaching the optimal solution. Particularly, we define a perturbation vector $\boldsymbol{\beta}_p^{(t)}$ for the $p$-th particle at the $t$-th iteration in (34), where $p = 1, \cdots, N_p$ and $t = 0, 1, \cdots, T$. For a given particle, by substituting (34) into (30), (31) and (32), the non-orthogonal downlink and uplink RF beamformers can be obtained as functions of the perturbation vector, i.e., $\mathbf{F}_D(\boldsymbol{\beta}_p^{(t)})$ and $\mathbf{F}_U(\boldsymbol{\beta}_p^{(t)})$, respectively. By using (29), the effective near-field SI channel is rewritten in (35). At the $t$-th iteration, the personal best for the $p$-th particle and the current global best among all particles are respectively found via (36a) and (36b).

Footnote 5: As a nature-inspired AI algorithm, PSO employs multiple search agents (i.e., particles) to explore and exploit the optimization search space through iterations [58]-[60]. It mimics the swarming behavior of animals for solving optimization problems. It has recently drawn great attention from researchers owing to its success in converging to the global optimal solutions of non-convex problems. Also, PSO is applied to various topics in wireless communications including mMIMO systems [25], [26], [31], [32].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
As the key element of the PSO algorithm, it is necessary to define a velocity vector based on the personal and global best solutions to increase the chance of convergence. Thus, the velocity vector for the $p$th particle at the $(t+1)$th iteration is calculated as in (37) [58], where $\mathbf{v}_p^{(t)}$ is the velocity of the $p$th particle at the $t$th iteration and $R_3$ is the diagonal inertia weight matrix. Here, $R_1$ reflects the social relations among the particles, whereas $R_2$ indicates the tendency of a given particle to move towards its personal best [58]. On the other hand, $R_3$ represents how a particle keeps its current velocity through the iterations to balance exploration and exploitation [59].

By using (37), during the iterations, the position of each particle is updated as in (38), where $\boldsymbol{\beta}_{\mathrm{Low}} \in \mathbb{R}^{2N_D + 2N_U}$ and $\boldsymbol{\beta}_{\mathrm{Upp}} \in \mathbb{R}^{2N_D + 2N_U}$ are the lower-bound and upper-bound vectors for the perturbation coefficients, respectively, and $\mathrm{clip}(x, a, b) = \min(\max(x, a), b)$ denotes the clipping function that prevents the position from exceeding the bounds. It is important to remark that $\boldsymbol{\beta}_{\mathrm{Low}}$ and $\boldsymbol{\beta}_{\mathrm{Upp}}$ are constructed according to the earlier defined boundaries of each perturbation coefficient given in (30) (footnote 6). Furthermore, different from the sub-optimal approach, we here consider each perturbation coefficient as a continuous variable inside its boundary.

The proposed PSO based perturbation coefficient optimization is summarized in Algorithm 1. For the initialization (i.e., $t = 0$), the entries of the perturbation vectors $\boldsymbol{\beta}_p^{(0)}$ are assumed to be uniformly distributed over $[\boldsymbol{\beta}_{\mathrm{Low}}, \boldsymbol{\beta}_{\mathrm{Upp}}]$. Moreover, the initial velocity of each particle is chosen as $\mathbf{v}_p^{(0)} = \mathbf{0}$. During the iterations, we first update the velocity of a given particle via (37); then, its position is updated via (38). At the end of each iteration, we find the personal best of each particle via (36a) and the current global best $\boldsymbol{\beta}_{\mathrm{best}}^{(t)}$ via (36b). After the final iteration, the output of Algorithm 1 is $\boldsymbol{\beta}_{\mathrm{best}}^{(T)}$.

Finally, the summary of the RF beamformer design, including both the OBF and NOBF schemes, is presented in Algorithm 2.
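The PSO loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: since (37) is not reproduced here, the velocity update uses the standard PSO form, with a scalar inertia weight `w` and uniformly random social/personal weights standing in for the diagonal matrices $R_3$, $R_1$ and $R_2$; the function name `pso_minimize`, its parameters, and the toy objective are hypothetical, and in the paper's setting the objective would be the effective near-field SI power as a function of the perturbation vector.

```python
import numpy as np

def pso_minimize(objective, lower, upper, n_particles=20, n_iters=100,
                 w=0.7, c_social=1.5, c_personal=1.5, seed=0):
    """Minimize `objective` over the box [lower, upper] with a basic PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    # Initialization (t = 0): uniform positions inside the bounds, zero velocity.
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros((n_particles, dim))

    # Personal bests and the current global best.
    pbest = pos.copy()
    pbest_val = np.array([objective(x) for x in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])

    for _ in range(n_iters):
        # Random per-dimension weights (assumed stand-ins for R_1 and R_2).
        r_social = c_social * rng.random((n_particles, dim))
        r_personal = c_personal * rng.random((n_particles, dim))
        # Velocity: inertia + pull towards global best + pull towards personal best.
        vel = w * vel + r_social * (gbest - pos) + r_personal * (pbest - pos)
        # Position update followed by clipping to the bounds.
        pos = np.clip(pos + vel, lower, upper)

        # Update the personal bests, then the global best.
        vals = np.array([objective(x) for x in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])

    return gbest, gbest_val

# Toy usage with a hypothetical objective standing in for the effective SI power.
toy_objective = lambda beta: float(np.sum((beta - 0.1) ** 2))
best_beta, best_value = pso_minimize(toy_objective, -0.5 * np.ones(4), 0.5 * np.ones(4))
```

Because every candidate is clipped before it is evaluated, the returned vector always lies inside the bounds, and the global best value can never get worse from one iteration to the next.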

V. BB PRECODER/COMBINER
After designing the RF beamformer, the BB precoder and combiner designs only employ the reduced-size effective downlink and uplink channel matrices, respectively. Therefore, this approach remarkably reduces the channel estimation overhead size in MU-mMIMO systems with large antenna arrays. Considering that the number of RF chains in the proposed FD-HBF technique is significantly smaller than the number of antennas (i.e., $N_D \ll M_D$ and $N_U \ll M_U$), the utilization of the effective downlink/uplink channel matrices reduces the total CSI overhead size (by 96.88% for the setup considered in Section VI). It is important to highlight that, different from [35]-[44], the instantaneous SI channel matrix $H_{\mathrm{SI}}$ is not required in the proposed BB precoder/combiner design.
We here develop two BB precoder/combiner schemes by applying regularized zero-forcing (RZF) and minimum mean square error (MMSE). As outlined in Section III, the primary objective of both schemes is maximizing the intended downlink/uplink signal power while suppressing the IUI power. Additionally, the proposed MMSE scheme aims to further suppress the residual SI power by exploiting only the slow time-varying AoD/AoA information of the SI channel, as in the RF beamformer.

A. REGULARIZED ZERO FORCING (RZF)
According to the well-known RZF technique [19], we first define the BB precoder as in (39), where $\varepsilon_D$ is the normalization scalar for assuring the maximum downlink transmit power constraint given in (17). According to the RZF technique, we here define $X_D$ such that it aims to eliminate IUI by taking the noise power $\sigma_w^2$ into account for the regularization. It is important to note that the design of $X_D$ varies with the precoding scheme (footnote 7). Similarly, the BB combiner is designed as in (40).

In the case of HD communications, the downlink and uplink transmissions are operated separately, where the received downlink signal given in (2) does not include IUI caused by uplink UEs and the received uplink signal given in (5) does not experience the strong SI. Then, RZF is developed by minimizing the mean square error (MSE) between the transmitted and received data signals. Hence, in HD communications, the RZF technique is equivalent to the MMSE solution [19]. In FD communications, however, the RZF based BB precoder/combiner solutions given in (39) and (40) do not aim at mitigating the residual SI signal; hence, they do not represent the MMSE solutions.

Algorithm 2 RF Beamformer Design
Input: $M_\Omega$, $\theta_\Omega$, $\psi_\Omega$, $\theta_{\mathrm{SI},\Omega}$, $\psi_{\mathrm{SI},\Omega}$ with $\Omega \in \{D, U\}$
1: Construct the angular supports $A_{\mathrm{SI},D}$, $A_{\mathrm{SI},U}$, $A_D$ and $A_U$ via (19a), (19b), (20) and (21)
[...]
if Sub-Optimal then
10: Find sub-optimal perturbation coefficients [...]

Footnote 7: When no regularization is applied, zero-forcing (ZF) is used to develop the BB precoder $B_D^{\mathrm{ZF}}$. In the case of noise-free HD transmission, ZF is the optimal precoding scheme; however, it could also magnify the noise effect in the case of noisy transmission [19]. On the other hand, the matched filter (MF) constructs the BB precoder as $B_D^{\mathrm{MF}} = \varepsilon_D H_D^H$ by simply selecting $X_D = I_{N_D}$. As shown in [19], MF achieves higher capacity than ZF in noise-dominant transmission (i.e., $P_D/\sigma_w^2 \to 0$), while ZF provides higher capacity as the transmit power increases (i.e., $P_D/\sigma_w^2 \to \infty$).
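The RZF, ZF and MF precoders discussed in this subsection can be illustrated with the short Python sketch below. It is a generic textbook-style example, not the paper's exact expressions: the function name `linear_precoder` is hypothetical, the regularization weight `K * sigma2 / P` is a common choice assumed here for illustration, and the power normalization $\|B\|_F^2 = P$ assumes, for simplicity, that the RF beamformer does not change the transmit power (e.g., it has orthonormal columns).

```python
import numpy as np

def linear_precoder(H, P, sigma2, scheme="rzf"):
    """Return an N x K precoder B with ||B||_F^2 = P for a K x N effective channel H."""
    K, N = H.shape
    Hh = H.conj().T
    if scheme == "mf":
        # Matched filter: corresponds to selecting X = I.
        B = Hh
    elif scheme == "zf":
        # Zero forcing: right pseudo-inverse of H (no regularization).
        B = Hh @ np.linalg.inv(H @ Hh)
    elif scheme == "rzf":
        # Regularized ZF: X = H^H H + reg * I, with an assumed regularization weight.
        reg = K * sigma2 / P
        B = np.linalg.solve(Hh @ H + reg * np.eye(N), Hh)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    # Normalization scalar enforcing the transmit power constraint.
    eps = np.sqrt(P / np.linalg.norm(B, "fro") ** 2)
    return eps * B
```

With ZF, the product `H @ B` is a scaled identity, so the inter-user interference is removed at the cost of possible noise enhancement; MF ignores the interference, and RZF interpolates between the two depending on the noise level.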

B. MINIMUM MEAN SQUARE ERROR (MMSE)
According to the received downlink signal vector given in (1) and the power constraint given in (17), the downlink MSE as a function of the BB precoder $B_D$ is obtained as in (41), where $\varepsilon_D$ is the normalization scalar for the transmit power constraint [61]. Then, the optimization problem for minimizing the downlink MSE under the power constraint is given by (42).

Proposition 3: According to (42), the MMSE solution for the BB precoder matrix is derived as in (43).
Proof: Please see Appendix C.

Similarly, by utilizing the combined uplink signal $\tilde{r}_U$ given in (4), the optimization problem for minimizing the uplink MSE is defined as a function of the BB combiner $B_U$ in (44).

Proposition 4: According to (44), the MMSE solution for the BB combiner matrix is derived as in (45).
Proof: Please see Appendix D.

Algorithm 3 summarizes all BB precoder/combiner schemes for the proposed FD-HBF technique, including the MMSE, RZF, ZF and MF schemes.

[Algorithm 3: BB Precoder/Combiner Design]
As a summary of the proposed FD-HBF technique, all five objectives listed in Section III have been jointly addressed during the RF beamformer and BB precoder/combiner design. Particularly, the RF beamformer maximizes the downlink/uplink beamforming gain while suppressing the strong SI signal to enhance the quality of SIC. Furthermore, the RF beamformer design is only based on slow time-varying AoD/AoA information to reduce the channel estimation overhead size. The RF-stage and BB-stage are interconnected with a significantly reduced number of RF chains. Then, by using the reduced-size effective channels H D and H U , the BB precoder/combiner also maximizes the intended signal power while suppressing the IUI power. Furthermore, the MMSE scheme in the BB-stage design further suppresses the residual SI power experienced after the RF-stage.

VI. ILLUSTRATIVE RESULTS
This section presents Monte Carlo simulation results to evaluate the performance of the proposed FD-HBF technique for simultaneous downlink/uplink transmission in MU-mMIMO systems. Particularly, we first investigate the amount of SIC achieved via the joint RF beamformer and BB precoder/combiner design in the FD-HBF technique. Furthermore, according to (17), we present the total sum-rate $R_{\mathrm{Total}} = R_D + R_U$ with downlink sum-rate $R_D$ and uplink sum-rate $R_U$. Based on the 3D urban microcell (UMi) scenario described in the latest 3GPP Release 16 [4], [62], [63], Table 2 summarizes the numerical values used in the simulation setup, unless otherwise stated. Also, we consider that the transmit and receive URAs are placed on the same surface with $C_x = 2$, $C_y = 0$, $C_z = 0$ and $\Theta = 0^\circ$ (please see Figure 2). It is important to remark that the proposed FD-HBF technique employs only $N_D = N_U = 8$ transmit/receive RF chains to support $M_D = M_U = 256$ transmit/receive antennas. Therefore, the proposed two-stage hybrid architecture provides a 96.88% reduction (i.e., $1 - 8/256 = 96.875\%$) in the hardware cost/complexity and the channel estimation overhead size in comparison to single-stage fully-digital beamforming.

Figure 4 plots the achieved SIC versus various transmit/receive URA sizes based on the proposed RF beamforming schemes in Section IV. Here, Figure 4(a) and Figure 4(b) respectively investigate the achieved SIC on the near-field SI channel $H_{\mathrm{LoS}}$ defined in (14) and the far-field SI channel $H_{\mathrm{NLoS}}$ defined in (16). We first observe that all OBF and NOBF schemes significantly mitigate the SI signal power after RF beamforming relative to the no RF beamforming scenario (footnote 9). Especially for the larger URAs in mMIMO systems, the amount of SIC increases on both SI channel components by means of the enhanced beamforming gain in the intended direction and the limited side-lobes towards the SI paths.
Particularly, the NOBF scheme with PSO achieves the highest SIC on the near-field SI channel, as seen in Figure 4(a). On the other hand, the NOBF scheme with sub-optimal perturbations is located between OBF and NOBF with PSO, while it converges to OBF for the large URAs due to its sub-optimality. Furthermore, the gap between the two NOBF schemes widens as the URA size increases. To illustrate, when the URA size is $M_D = M_U = 16, 64, 256, 400$, NOBF with PSO achieves 2.7 dB, 4.4 dB, 11.2 dB and 13.6 dB higher near-field SIC than NOBF with sub-optimal perturbations, respectively. However, by breaking the orthogonality property, the NOBF schemes might experience a slight degradation in the achieved SIC on the far-field SI channel, as shown in Figure 4(b). For example, with $M_D = M_U = 256$ antennas, OBF and NOBF with PSO obtain an SIC of 92.8 dB and 90.5 dB, respectively. In other words, NOBF encounters a 2.3 dB degradation in the far-field SIC. Hence, there is an interesting trade-off between OBF and NOBF for improving either the near-field or the far-field SIC.

Figure 5 demonstrates the achieved SIC versus the antenna isolation $P_{\mathrm{IS,dB}} \in [0\ \mathrm{dB}, 100\ \mathrm{dB}]$. Henceforth, we consider a BS with $M_D = M_U = 16 \times 16 = 256$ transmit/receive antennas. As mentioned earlier, the near-field SI channel power without RF beamforming is [...] On the other hand, as expected, the far-field SI channel power is independent of the antenna isolation. Therefore, the dotted curves remain constant across all $P_{\mathrm{IS,dB}}$ values. Afterwards, the solid curves present the complete SI channel power including both near-field and far-field components as expressed in (13). The first critical observation is the dominance of the near-field component when the antenna isolation based SIC is limited. On the other hand, the far-field component becomes dominant as $P_{\mathrm{IS,dB}}$ increases. Here, we analyze the following two scenarios:

Footnote 9: As a reference, the no RF beamforming scenario is set to 0 dB.

A. SELF-INTERFERENCE CANCELLATION
• No RF Beamforming: Initially, the complete SI channel power without antenna isolation (i.e., $P_{\mathrm{IS,dB}} = 0$ dB) is observed as $10\log_{10}(\|H_{\mathrm{SI}}\|_F^2) = 0$ dB. Although the antenna isolation affects the SIC quality, the NLoS power of $10\log_{10}(\|H_{\mathrm{NLoS}}\|_F^2) = -16.8$ dB dominates the complete SI channel power for $P_{\mathrm{IS,dB}} \geq 30$ dB. Hence, even though we dramatically increase the quality of antenna isolation up to $P_{\mathrm{IS,dB}} = 100$ dB without any RF beamforming based SIC, the achieved SIC on the complete SI channel is limited to approximately 16.8 dB, i.e., the level set by the NLoS component. [...]

Similarly, by using the uplink received signal given in (5), the intended signal power and the SI power are plotted for the uplink transmission. For the noise power spectral density (PSD) and channel bandwidth given in Table 2, the noise floor is at −101 dBm. In the proposed FD-HBF technique, we develop two schemes at the RF beamformer (i.e., OBF and NOBF (footnote 10)) and two schemes at the BB precoder/combiner (i.e., RZF and MMSE). Thus, the proposed FD-HBF technique may jointly design the RF-stage and BB-stage via four possible schemes: (i) NOBF+MMSE, (ii) NOBF+RZF, (iii) OBF+MMSE, (iv) OBF+RZF. Given the downlink transmit power at the BS as $P_D = 30$ dBm, the numerical results show that the proposed FD-HBF with NOBF+MMSE reduces the SI power to −48.1 dBm without any antenna isolation (i.e., $P_{\mathrm{IS,dB}} = 0$ dB). Thus, NOBF+MMSE accomplishes 78.1 dB SIC on its own. Furthermore, when the antenna isolation is larger than 55 dB, NOBF+MMSE reduces the SI power below the noise floor. On the other hand, OBF+MMSE requires at least 65 dB antenna isolation to keep the SI power below the noise floor. When we only analyze the BB precoder/combiner design, it is seen that both MMSE schemes achieve approximately 1.3 dB higher SIC compared to their RZF counterparts.

Footnote 10: In the rest of the paper, we only consider NOBF with the PSO approach.
Although the downlink/uplink intended signal powers are comparable for all schemes, the OBF schemes offer an improvement of approximately 0.5 dB compared to the NOBF schemes. We also observe that the IUI from the uplink UEs to the downlink UEs is 10.8 dB below the noise floor.
As shown in [13]-[15], practical antenna isolation techniques can achieve up to 60-70 dB cancellation. Hence, the proposed FD-HBF technique with NOBF+MMSE can reduce the SI power below the noise floor under practical antenna isolation assumptions. Moreover, given the BS downlink transmit power of $P_D = 30$ dBm and an SI power below the noise floor at −101 dBm, NOBF+MMSE can achieve more than 130 dB SIC.
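The SIC figures quoted in this subsection follow from simple dB arithmetic, which the Python sketch below reproduces from the values stated in the text (30 dBm transmit power, −48.1 dBm residual SI with NOBF+MMSE and no isolation, −101 dBm noise floor). The function and variable names are hypothetical. The isolation estimate additionally assumes, as a simplification, that antenna isolation attenuates the residual SI dB-for-dB; the text reports that more than 55 dB is needed in practice, which is plausibly higher because the far-field SI component is not attenuated by antenna isolation.

```python
def sic_db(tx_power_dbm, residual_si_dbm):
    """Achieved SIC in dB: transmit power minus residual SI power."""
    return tx_power_dbm - residual_si_dbm

def isolation_needed_db(residual_si_dbm, noise_floor_dbm):
    """Extra attenuation needed to bring the residual SI down to the noise floor
    (simplifying assumption: isolation acts dB-for-dB on the residual SI)."""
    return max(0.0, residual_si_dbm - noise_floor_dbm)

P_D_DBM = 30.0             # BS downlink transmit power (from the text)
SI_NO_ISOLATION = -48.1    # residual SI with NOBF+MMSE, no isolation (from the text)
NOISE_FLOOR_DBM = -101.0   # noise floor (from the text)

# SIC achieved by beamforming alone (about 78.1 dB).
beamforming_sic = sic_db(P_D_DBM, SI_NO_ISOLATION)
# Rough lower estimate of the isolation needed to reach the noise floor (about 52.9 dB).
extra_isolation = isolation_needed_db(SI_NO_ISOLATION, NOISE_FLOOR_DBM)
# Total SIC once the SI power sits at the noise floor (131 dB, i.e., more than 130 dB).
total_sic_at_floor = sic_db(P_D_DBM, NOISE_FLOOR_DBM)
```

The last quantity makes explicit why pushing the SI power below a −101 dBm noise floor at 30 dBm transmit power corresponds to more than 130 dB of total SIC.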

B. SUM-RATE PERFORMANCE
During the sum-rate analysis of MU-mMIMO systems, we compare the performance results of the proposed FD-HBF technique with its HD counterpart. As a benchmark scheme, we consider the angular-based HP technique (footnote 11) in [30], which only considers the downlink transmission by applying OBF at the RF-stage and MMSE at the BB-stage. Although [30] does not address the uplink transmission, it nevertheless serves as a benchmark. By developing the angular-based HC technique for the uplink transmission, we generalize [30] as an angular-based HBF technique for HD downlink/uplink transmission. Hence, it is called the HD-HBF technique. It is important to remark that the HD downlink and uplink transmissions are carried out over either different time-slots or different frequency bands. Therefore, the downlink, uplink and total sum-rates in the HD transmission are normalized as $R_{D,\mathrm{HD}} = \frac{1}{2} R_D$, $R_{U,\mathrm{HD}} = \frac{1}{2} R_U$, and $R_{\mathrm{Total,HD}} = R_{D,\mathrm{HD}} + R_{U,\mathrm{HD}}$, respectively.

Figure 7 compares the sum-rate performance of the proposed FD-HBF and HD-HBF [30], where the BS serves $K_D = K_U = 4$ downlink/uplink UEs. Particularly, the black solid curves plot the total sum-rate $R_{\mathrm{Total}} = R_D + R_U$, while the red dashed (blue dotted) curves present the downlink (uplink) sum-rate $R_D$ ($R_U$). When the BS operates in FD mode, it can simultaneously serve $K = K_D + K_U = 8$ UEs over the same frequency band. On the other hand, when the BS operates in HD mode, two orthogonal time/frequency resources are necessary to individually serve $K_D = 4$ downlink and $K_U = 4$ uplink UEs. In Figure 7(a), we observe that the proposed FD-HBF with the NOBF+MMSE scheme achieves the highest total sum-rate among the FD transmission schemes for antenna isolation up to $P_{\mathrm{IS,dB}} = 70$ dB. For instance, regarding the FD transmission, NOBF+MMSE, NOBF+RZF, OBF+MMSE and OBF+RZF achieve total sum-rate capacities of 47.8 bps/Hz, 45.8 bps/Hz, 38.8 bps/Hz and 37.7 bps/Hz at $P_{\mathrm{IS,dB}} = 55$ dB, respectively.
Moreover, the proposed FD-HBF with NOBF+MMSE starts outperforming HD-HBF after $P_{\mathrm{IS,dB}} = 25$ dB, where both provide approximately 28.2 bps/Hz sum-rate capacity. In Figure 7(b), we evaluate the FD-to-HD sum-rate ratio by calculating the total/downlink/uplink sum-rate ratio of the FD and HD transmission modes (i.e., $0 \leq R_{\Omega,\mathrm{FD}} / R_{\Omega,\mathrm{HD}} \leq 2$ with $\Omega \in \{\mathrm{Total}, D, U\}$). It is seen that NOBF+MMSE achieves an FD-to-HD sum-rate ratio of 1.91 around $P_{\mathrm{IS,dB}} = 70$ dB. Moreover, as the antenna isolation improves, FD-HBF with OBF+MMSE provides 1.95 times higher sum-rate capacity in comparison to HD-HBF. Additionally, the antenna isolation quality only affects the SI power experienced in the uplink received signal; therefore, the downlink sum-rate is independent of the antenna isolation. The numerical results also reveal an interesting contrast between the OBF and NOBF schemes for downlink and uplink transmission. To illustrate, by further suppressing the SI power, the NOBF schemes greatly enhance the uplink capacity compared to the OBF schemes, whereas OBF achieves slightly better downlink capacity by means of the orthogonality property. Overall, the NOBF schemes are more favorable since they provide higher total sum-rate capacity under practical antenna isolation levels of up to 60-70 dB [13]-[15].
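The HD normalization and the FD-to-HD sum-rate ratio used in Figure 7(b) can be written out as a few lines of Python. This is only a sketch of the bookkeeping described above; the function names are hypothetical, and the argument `r_hd_full` denotes the rate a link would achieve in HD mode before the factor $\frac{1}{2}$ is applied.

```python
def hd_rate(rate_bps_hz):
    """HD rate after normalization: each link only gets half of the time/frequency resource."""
    return 0.5 * rate_bps_hz

def fd_to_hd_ratio(r_fd, r_hd_full):
    """FD-to-HD sum-rate ratio for a given link (or for the total)."""
    return r_fd / hd_rate(r_hd_full)

def capacity_gain_percent(ratio):
    """Capacity gain of FD over HD implied by an FD-to-HD ratio."""
    return 100.0 * (ratio - 1.0)

# If FD suffered no SI/IUI penalty, its rate would equal the un-normalized HD rate,
# which gives the upper bound of 2 on the ratio.
upper_bound = fd_to_hd_ratio(10.0, 10.0)
```

For example, a ratio of 1 marks the break-even point where FD and HD provide the same capacity (as reported around 25 dB antenna isolation for NOBF+MMSE), while a ratio of 1.75 corresponds to a 75% capacity gain.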
In Figure 8, we present the sum-rate performance versus the number of downlink/uplink UEs, where the antenna isolation is set to $P_{\mathrm{IS,dB}} = 60$ dB [13]-[15]. Here, we observe that the proposed FD-HBF remarkably outperforms HD-HBF under all four possible schemes. However, as the number of UEs increases, there is a slight degradation in the FD-to-HD sum-rate ratio shown in Figure 8(b). For instance, when there is $K_D = 1$ downlink UE and $K_U = 1$ uplink UE, the proposed FD-HBF with NOBF+MMSE approximately doubles the capacity in comparison to its HD counterpart. Under the same scheme, the FD-to-HD sum-rate ratio is 1.75 for $K_D = K_U = 6$ downlink/uplink UEs. In other words, the proposed FD-HBF technique can increase the capacity by about 75% or more with respect to the conventional HD transmission. Moreover, both NOBF schemes have higher capacity than their OBF counterparts in the uplink transmission by means of the enhanced SIC quality (please see Figure 5). However, the FD-to-HD sum-rate ratio decays with a larger number of UEs due to the increased interference power. On the other hand, in the downlink transmission, both OBF schemes achieve approximately the same sum-rate performance, and they become slightly superior to the NOBF schemes as the number of UEs increases.

Finally, Figure 9 illustrates the FD-to-HD sum-rate ratio versus the BS transmit power $P_D$ and the UE transmit power $P_U$, where there are $K_D = K_U = 4$ UEs. Also, the antenna isolation is set to $P_{\mathrm{IS,dB}} = 60$ dB [13]-[15]. Given the maximum BS transmit power of 35 dBm at mmWave frequencies [62], the BS transmit power range is considered as $P_D \in [0, 35]$ dBm. Although the BS and UEs have different hardware constraints, for the sake of simplicity, we also apply the same range for the UE transmit power (i.e., $P_U \in [0, 35]$ dBm). Here, we only consider the proposed FD-HBF technique with the NOBF+MMSE scheme.
When the total sum-rate ratio presented in Figure 9(a) is analyzed, we observe that the proposed FD transmission technique can nearly double the capacity with respect to the conventional HD transmission scheme. Even though the FD-to-HD total sum-rate ratio decays for higher BS/UE transmit power, the proposed FD transmission technique enhances the capacity by at least 62%. On the other hand, the downlink sum-rate ratio improves as $P_D$ increases, as shown in Figure 9(b). It drops below unity only for $P_U = 35$ dBm and $P_D = 0$ dBm (i.e., $R_{D,\mathrm{FD}} / R_{D,\mathrm{HD}} = 0.90$, so that HD-HBF provides higher capacity than FD-HBF), where the large uplink power boosts the IUI power in comparison to the low downlink intended signal power (please see (2)). Similarly, the uplink sum-rate ratio results in Figure 9(c) demonstrate that the increased SI power due to the high BS transmit power might negatively affect the uplink transmission. To illustrate, at $P_U = 25$ dBm, the FD-to-HD uplink sum-rate ratios are 2.00 and 1.44 for $P_D = 0$ dBm and $P_D = 35$ dBm, respectively. Hence, we observe that a power control mechanism might be necessary to enhance both the downlink and uplink sum-rate ratios.

VII. CONCLUSIONS
In this work, a novel full-duplex hybrid beamforming (FD-HBF) technique has been proposed for MU-mMIMO systems. In the HBF architecture, the RF beamformer has been developed via the slow time-varying AoD/AoA information of both the intended and SI channels. During the RF beamformer design, the OBF scheme with orthogonal beams has first been developed for maximizing the beamforming gain towards the intended direction and canceling the far-field component of the SI channel. Additionally, we have introduced the NOBF scheme, which applies perturbations to the orthogonal beams to also suppress the near-field component of the SI channel. Having shown the impractical computational complexity of an exhaustive search for the optimal set of perturbations, we have proposed to employ swarm intelligence to find the optimal perturbations. Afterwards, the BB-stage has been designed via only the reduced-size effective intended channel matrices, where the BB precoder/combiner solutions have been derived via the RZF and MMSE techniques. Finally, the proposed FD-HBF technique has addressed the following five objectives: (i) maximizing the intended downlink/uplink signal power, (ii) improving the SIC quality by successfully suppressing the strong SI power, (iii) mitigating the IUI power experienced in the downlink/uplink, (iv) reducing the number of RF chains for mMIMO systems with large antenna arrays, (v) decreasing the CSI overhead size. The numerical results demonstrate that FD-HBF with NOBF+MMSE significantly suppresses the strong SI power by providing 78.1 dB SIC on its own. Furthermore, along with practical antenna isolation, FD-HBF can reduce the SI power below the noise floor by achieving more than 130 dB SIC. Moreover, the proposed FD-HBF technique greatly enhances the sum-rate capacity and approximately doubles it compared to its HD counterpart.

APPENDIX A PROOF OF PROPOSITION 1
As shown in Figure 1, when a single transmit/receive RF chain is utilized (i.e., $N_D = N_U = 1$), the downlink and uplink RF beamformers each turn into a single-column vector. In the NOBF scheme, the non-orthogonal angle-pairs given in (30) are utilized to develop the downlink RF beamformer in $\mathbb{C}^{M_D}$ and the uplink RF beamformer in $\mathbb{C}^{M_U}$. In order to obtain the non-orthogonal angle-pairs, it is necessary to find the perturbation coefficients minimizing the power of the effective SI channel expressed in (29), as formulated in (46), where $\chi$ options are present for each of the four perturbation coefficients. Thus, the total number of downlink/uplink RF beam combinations is $\chi^4$. However, as a special case of the URA, only two perturbation coefficients are present for ULAs (e.g., M U,n = 0), which reduces the number of comparisons to $\chi^2$.

APPENDIX B PROOF OF PROPOSITION 2
The downlink RF beamformer in the OBF scheme is designed via the set of $N_D$ orthogonal downlink angle-pairs $(\lambda^{(x)}_{D,m_i}, \lambda^{(y)}_{D,n_i})$ with $i = 1, \cdots, N_D$ as given in (27). In the NOBF scheme, we apply the perturbation to the orthogonal angle-pairs as shown in (30). Afterwards, the non-orthogonal downlink RF beamformer is constructed as in (31).
It is crucial to note that we need the optimal perturbation coefficients to develop the non-orthogonal RF beamformers suppressing the effective SI channel power. In other words, according to (29), we aim to suppress $\|F_U H_{\mathrm{LoS}} F_D\|_F^2$. Thus, the optimization problem in (47) seeks the perturbation coefficients, within the boundaries given in (30), that attain $\arg\min \|F_U H_{\mathrm{LoS}} F_D\|_F^2$, where there are $2(N_D + N_U)$ perturbation coefficients to be jointly optimized. Given $\chi$ options for each perturbation coefficient, the exhaustive search for the above optimization problem requires $\chi^{2(N_D + N_U)}$ comparisons, which entails a prohibitive computational complexity.
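To give a sense of scale for the exhaustive search cost stated above, the following Python sketch evaluates $\chi^{2(N_D + N_U)}$ and compares it with the number of objective evaluations of a PSO run. The function names are hypothetical; the choice $\chi = 10$ and the PSO budget of 20 particles and 100 iterations are illustrative assumptions, and the PSO count assumes one objective evaluation per particle per iteration plus the initialization.

```python
def exhaustive_combinations(chi, n_d, n_u):
    """Number of candidate coefficient sets in the exhaustive search: chi^(2(N_D + N_U))."""
    return chi ** (2 * (n_d + n_u))

def pso_evaluations(n_particles, n_iters):
    """Objective evaluations of a PSO run (assumed: one per particle per iteration,
    plus one per particle at initialization)."""
    return n_particles * (n_iters + 1)

# Single RF chain per side (N_D = N_U = 1): chi^4 combinations.
single_chain = exhaustive_combinations(10, 1, 1)
# Simulation setup with N_D = N_U = 8 RF chains: chi^32 combinations.
full_setup = exhaustive_combinations(10, 8, 8)
# Illustrative PSO budget for comparison.
pso_budget = pso_evaluations(20, 100)
```

Under these illustrative values, the exhaustive search grows from $10^4$ to $10^{32}$ candidates when moving from one to eight RF chains per side, whereas the PSO budget stays at a few thousand evaluations regardless of $\chi$.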

APPENDIX C PROOF OF PROPOSITION 3
By using (1) and (41), the downlink MSE can be expanded as in (48), where we utilize $\mathbb{E}\{d_D d_D^H\} = I_{K_D}$, $\mathbb{E}\{d_U d_U^H\} = P_U I_{K_U}$ and the linearity of the trace operator. Here, the IUI channel matrix from the uplink to the downlink UEs is not known at the BS. Thus, instead of directly using the instantaneous $H_{\mathrm{IUI}}$, the MSE only utilizes its expected value $\mathbb{E}\{H_{\mathrm{IUI}} H_{\mathrm{IUI}}^H\}$. According to the maximum downlink transmit power constraint expressed in (42), the Lagrangian function is constructed as in (49), where $\alpha \in \mathbb{R}$ is the Lagrangian multiplier. For finding the solution of (42), it is necessary to set the derivatives/gradients of the Lagrangian function to zero as in (50). Then, the BB precoder is found as in (51), where $X_D = H_D^H H_D + \alpha \varepsilon_D^2 F_D^H F_D$ depends on $\alpha$ and $\varepsilon_D$. Thus, it is necessary to find their closed-form solutions as well. The derivative of the Lagrangian function with respect to $\alpha$ is obtained as in (52). By substituting (51) into (52), the normalization scalar is found as in (53). By using (51) and (53), we can write the equality in (54). Moreover, according to the IUI channel model presented in Section II-B3, we obtain (55) and (56). Then, by applying (55) and (56) to (54), the closed-form expression of the Lagrangian multiplier is derived as in (57). Finally, the proof of (43) is concluded by combining (51), (53) and (57).

APPENDIX D PROOF OF PROPOSITION 4
By combining (4) and (44), the uplink MSE is rewritten as in (58). Although the current form of the uplink MSE depends on the instantaneous perfect SI channel matrix $H_{\mathrm{SI}}$, the proposed FD-HBF technique does not utilize it, considering the practical implementation under various hardware imperfections as well as estimation errors [9]-[12]. Instead, as in the RF beamformer design presented in Section IV, the slow time-varying AoD/AoA information is utilized in the MMSE based BB combiner. Hence, the corresponding expression is approximated as in (59), where (a) is approximated by applying (18) and the dominant far-field component after antenna isolation based SIC, and (b) is obtained via the expectation of the complex path gain matrix $Z_{\mathrm{SI}}$ with $\Upsilon_{\mathrm{SI}} = \frac{1}{\bar{\tau}_{\mathrm{SI}}^{\eta} \sqrt{Q_{\mathrm{SI}}}} F_U \hat{A}_{\mathrm{SI},U} \hat{A}_{\mathrm{SI},D} F_D B_D \in \mathbb{C}^{N_U \times K_D}$. Here, we define $\bar{\tau}_{\mathrm{SI}}$ as the average distance of the far-field components (i.e., reflected NLoS paths). Moreover, $\hat{A}_{\mathrm{SI},U}$ and $\hat{A}_{\mathrm{SI},D}$ are the approximated slow time-varying array phase response matrices, which only require the AoD/AoA mean and spread instead of exact AoD/AoA knowledge. Therefore, one can easily build $\hat{A}_{\mathrm{SI},U}$ and $\hat{A}_{\mathrm{SI},D}$ by substituting $Q_{\mathrm{SI}}$ random paths into (8), (11) and (16). By combining (58) and (59), the gradient of the uplink MSE with respect to $B_U$ is obtained as in (60). Finally, we derive the BB combiner satisfying the MMSE criterion (i.e., $\partial \mathrm{MSE}_U(B_U) / \partial B_U = 0$) as expressed in (45).