Self-Interference Cancellation and Channel Estimation in Multicarrier-Division Duplex Systems With Hybrid Beamforming

Design of full-duplex (FD) wireless systems faces many challenges, including self-interference cancellation (SIC), the provision of high capacity, flexibility of operation, and efficient usage of resources. In this paper, we propose and investigate a multicarrier-division duplex (MDD) based hybrid beamforming system operated in FD mode, which is endowed with the advantages of both time-division duplex and frequency-division duplex. It also shares some merits of FD and can be free of self-interference (SI) in the digital domain, but faces the same SI challenge as FD in the analog domain. Hence, in this paper, we first propose an adaptive beamforming assisted SI cancellation scheme that takes into account the practical requirement of analog-to-digital conversion (ADC). It can be shown that the proposed approach is capable of jointly coping with the desired signals' transmission and SI suppression. Then, channel estimation (CEst) in the MDD/MU-MIMO system is proposed by exploiting the reciprocity between the uplink and downlink subcarrier channels that is provided by MDD. Correspondingly, orthogonality-achieving pilot symbols are designed, and the least-square (LS) CEst as well as the linear minimum mean-square error (LMMSE) CEst are derived. Finally, the performance of MDD/MU-MIMO systems employing the proposed SIC method is investigated with respect to the SI cancellation capability, sum-rate potential, CEst performance, and the effect of CEst on the achievable performance. Our studies show that MDD/MU-MIMO provides an effective option for the design of future wireless transceivers.


I. INTRODUCTION
Achieving the highest possible spectral efficiency to meet the ever-increasing demand for data rate has always been a top priority in the design of wireless communication systems, especially of the fifth-generation (5G) and beyond wireless systems [1], [2]. Currently, all wireless networks are operated in half-duplex (HD) mode, based on either time-division duplex (TDD) or frequency-division duplex (FDD). Specifically, in cellular wireless systems, downlink (DL) and uplink (UL) transmissions are supported by different time-slots or different frequency bands. However, these HD modes have some weaknesses in terms of system performance and complexity, hindering their employment in future wireless systems. For instance, although FDD enables DL and UL transmissions at the same time and avoids interference between DL and UL, the complexity and overhead of channel estimation (CEst) in large-scale antenna systems may be unaffordable, due to the incoherence of UL/DL channels. On the other hand, TDD can benefit from the reciprocity between DL and UL channels, so that only the base station (BS) needs to estimate the UL channels, while the DL channel state information (CSI) can be derived from the UL CEst. However, in TDD-based systems, a guard time between DL/UL switching is required, which may be significant in future broadband wireless systems, yielding serious delay and inefficient usage of resources [3]. Due to these limitations of TDD and FDD, full duplex (FD), more specifically in-band full duplex (IBFD), has received increasing interest in recent years, owing to its potential to nearly double the capacity of a wireless channel [4]-[6].
The associate editor coordinating the review of this manuscript and approving it for publication was Wenchi Cheng.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
However, to make IBFD systems feasible, it is critical to efficiently suppress the self-interference (SI) imposed by a node's transmitted signal on its own received signal, which is usually significant in real wireless systems. According to [4], for an IBFD-based BS to achieve a link signal-to-noise ratio (SNR) equal to that of an HD counterpart, it is typically required to suppress SI by about 110 dB. To achieve this target, various SI cancellation (SIC) techniques, operated in the propagation, analog and/or digital domains, have been proposed and investigated in the literature, as seen, for example, in [4]-[9].
On the other hand, the heterogeneous wireless systems in 5G and beyond will no doubt rely on multicarrier signaling, following the principles of orthogonal frequency division multiplexing (OFDM) and non-orthogonal multiple access (NOMA). Owing to this consideration, an out-band FD (OBFD) scheme, referred to as multicarrier division duplex (MDD), has been proposed in [10] and further studied in [11], [12]. The studies of [12] show that MDD is capable of exploiting the advantages of both FDD and TDD, and has the potential to outperform both the HD and IBFD schemes in some application scenarios. To be more specific, MDD may be deemed a frequency-domain counterpart of TDD and hence shares all the advantages of TDD. For example, instead of a flexible UL/DL time-slot allocation, MDD can rely on a flexible UL/DL subcarrier allocation to attain the reciprocity between UL and DL. Like TDD, MDD is also flexible in supporting asymmetrical UL/DL traffic, by assigning the corresponding numbers of subcarriers to UL and DL. However, MDD may outperform TDD by saving the guard time required by TDD for UL/DL switching. MDD can also inherit the merits of FDD. For example, under FDD, there is no switch-over between the transmission and receiving of a node on the same frequency. Similarly, in MDD there is no switch-over of this kind, provided that the subcarriers assigned to a node for receiving and transmission are not changed. MDD also outperforms FDD in terms of CEst, as MDD is capable of exploiting the reciprocity between UL and DL for CEst, while the UL/DL channels in FDD are not reciprocal. In other words, relying on MDD, the CSI required for DL transmitter preprocessing can be obtained from the CSI estimated from the UL receiving, with the aid of the frequency-domain correlation of wireless channels.
Furthermore, since UL channel training and DL transmission happen concurrently in MDD systems, the MDD mode can substantially mitigate the channel aging problem that the TDD/FDD modes experience when the channel varies relatively fast in time [13].
In comparison with the IBFD schemes [4], [5], where transmission and receiving happen at the same time and on the same frequency (subcarrier), MDD allows transmission and receiving to take place at the same time and within the same frequency band, but on different subcarriers. This arrangement relaxes the SIC requirement of IBFD and achieves SI-free signal processing in the digital domain. Nevertheless, as MDD is a type of FD, it inherits the challenges of general FD systems, such as the effect of SI on the receiver's ADC. However, MDD has the advantage that SI may be near-perfectly removed in the digital domain, provided that the SI is sufficiently suppressed in the propagation/analog domains, so that the received signals (SI plus desired signal) fall within the dynamic range of the receiver's ADC. In other words, once SI is sufficiently suppressed in the propagation/analog domains so that the total received signal stays within the ADC's dynamic range, the FFT operation in the receiver is able to remove all the residual SI without any extra complexity. Therefore, in MDD-based systems, SIC is only required in the propagation and analog domains.
Owing to the above-mentioned merits of MDD, in this paper we investigate efficient SIC methods for the MDD-based multiuser multiple-input multiple-output (MDD/MU-MIMO) system, as well as its channel estimation by leveraging the correlation existing between DL and UL subcarriers.

A. RELATED WORKS
From the literature [14]-[19] we know that, in the context of conventional FD MIMO systems, various SIC methods in the propagation and analog domains have been proposed. These SIC methods may also be introduced to MDD/MU-MIMO systems. However, when the MDD/MU-MIMO system is large-scale in terms of the number of transmit/receive antennas, the traditional SIC approaches for MIMO may not be suitable, due to the resulting huge overhead and complexity. Fortunately, in this case, the large number of antennas can be leveraged for SI suppression. To this end, beamforming based SIC has become one of the most important methods for SI reduction. Specifically, in [20], a full-digital precoder has been designed to point SI signals to the null space of the desired received signals and, thereby, cancel the SI signals in the analog domain. By properly designing the full-digital precoders, SI can also be suppressed along with the maximization of sum-rate [21], [22]. Instead of full-digital precoding, as shown in [23], hybrid precoding is capable of achieving an SI reduction of up to 30 dB. However, to the best of our knowledge, regarding the joint design of desired signal transmission and SIC, all the beamforming based SIC methods presented so far only provide a fixed amount of SI reduction. This may not satisfy the different requirements of analog-domain SI reduction in practice and, consequently, causes large quantization noise after ADC if the SI reduction is insufficient. Furthermore, in these references, only point-to-point single-carrier MIMO communication scenarios have been considered. Additionally, in [12], the MDD-assisted point-to-point multicarrier MIMO system employing full-digital precoder/combiner has been proposed and studied, demonstrating that the MDD mode is capable of outperforming the HD and IBFD modes in some communication scenarios. However, when antenna arrays become large, the full-digital SIC methods are no longer feasible in hybrid beamforming systems.
Hence, to satisfy the requirement of ADC in various communication environments, the study on the dynamic SI reduction in the hybrid beamforming assisted IBFD or OBFD systems is highly important, but has not been considered in open literature.

B. CONTRIBUTIONS
Against this background, in this paper we study MDD/MU-MIMO hybrid beamforming systems with respect to the following issues. Firstly, the SIC requirement for the ADC to operate efficiently within its dynamic range is modeled. Then, to make the MDD mode feasible in large-scale MU-MIMO systems, we propose an adaptive beamforming based SIC method via analog precoder/combiner design in order to dynamically suppress SI, which has not been investigated in the open literature. Various design options are addressed by taking account of the trade-off between performance and complexity. Furthermore, zero-forcing (ZF) and minimum mean squared error (MMSE) algorithms are respectively introduced to design the digital precoder and combiner. Our studies show that the proposed beamforming SIC method is capable of suppressing the SI while simultaneously maintaining the performance of the desired DL/UL communications, when appropriate initialization is applied in the proposed algorithms. Furthermore, our proposed method is capable of providing SIC over a large dynamic range of up to 300 dB, which is achieved by applying different system configurations and an appropriate number of SIC iterations.
Secondly, we address the CEst in MDD/MU-MIMO systems, and consider the estimation of both UL and DL channels by exploiting the correlation existing among subcarriers and the reciprocity existing between the UL and DL channels. To be more specific, we first design a set of frequency-domain orthogonality-achieving pilot symbols (PSs) for estimating the UL/DL channels of all mobile stations (MSs). Based on these orthogonality-achieving PSs, the least square (LS) CEst is implemented to obtain the time-domain UL channel impulse response (CIR), from which the UL/DL CIRs of all subcarriers are derived with the aid of the above-mentioned subcarrier correlation and reciprocity of UL/DL channels. Furthermore, we consider the CEst in the scenario where employing orthogonality-achieving PSs is impossible due to a big number of MSs and/or randomly distributed UL subcarriers. Correspondingly, the CEst is accomplished in the principles of linear minimum mean square error (LMMSE).
Finally, we investigate and compare the performance of the MDD/MU-MIMO systems. First, we demonstrate the performance of the proposed SIC schemes. Then, the spectral-efficiency performance of MDD/MU-MIMO systems with various beamforming aided SIC options is investigated and compared. Furthermore, the performance of the proposed CEst schemes is demonstrated, and the impact of CEst on the achievable spectral efficiency is studied and compared. Our studies show that the LS CEst relying on the orthogonality-achieving PSs enables the MDD/MU-MIMO system to achieve nearly the same sum-rate as the counterpart system with perfect CSI. In the case that orthogonality-achieving PSs are impossible, the LMMSE CEst is promising, achieving a sum-rate close to that attained under the assumption of perfect CSI.
The novelty of our work is compared with the related works in Table 1. Note that in addition to the differences stated in the table, the channel estimation in MDD-based systems was not considered in [12]. The contributions of this paper can be summarized as follows:
• As both IBFD and OBFD have to implement efficient SIC to make FD operation feasible, an adaptive beamforming based SIC scheme is proposed to meet the ADC's requirement for SI reduction. The proposed SIC scheme is capable of providing a large range of SIC in the analog domain at only a small cost in system performance. We also demonstrate the impact of the Rician factor of the SI channel, the number of antennas and the angle between the TX/RX antenna arrays on the performance of SIC, from which we can gain insight for the system configuration of not only MDD-based systems but also other IBFD systems.
• Hybrid precoding and combining schemes with SIC capability are designed for MDD/MU-MIMO systems. The studies and performance results, including the robustness of the proposed SIC scheme and the performance of MDD/MU-MIMO systems, demonstrate the effectiveness of the designed schemes and also the merits of MDD-based systems.
• By leveraging subcarrier correlation and reciprocity between UL and DL channels, an efficient CEst method is presented for obtaining the CSI of both UL and DL channels. Specific subcarrier allocation for frequency-domain pilot transmission is proposed to benefit CEst. Furthermore, to reduce system overhead in more general scenarios, a CEst method based on LMMSE is presented and studied, which is also capable of allowing MDD/MU-MIMO systems to achieve promising performance.
The rest of the paper is organized as follows. In Section II, we address the modeling of MDD/MU-MIMO systems, including the transmitter, channel and receiver models, as well as the modeling of the ADC. In Section III, the hybrid beamforming is designed with the objective of maximizing the sum-rate while simultaneously meeting the SIC target. CEst is studied in Section IV, while simulation results are demonstrated in Section V. Finally, in Section VI, we summarize the findings of our research and consider their implications.
Throughout the paper, the following notations are used: $\boldsymbol{A}$, $\boldsymbol{a}$ and $a$ denote a matrix, a vector and a scalar, respectively; $\mathcal{A}$ represents a set; $\boldsymbol{a}(i)$ denotes the $i$-th element of $\boldsymbol{a}$, and $\boldsymbol{A}(i,j)$ denotes the $(i,j)$-th element of $\boldsymbol{A}$; $|\boldsymbol{A}|$, $\boldsymbol{A}^{*}$, $\boldsymbol{A}^{T}$, $\boldsymbol{A}^{-1}$ and $\boldsymbol{A}^{H}$ represent, respectively, the determinant, complex conjugate, transpose, inverse and Hermitian transpose of $\boldsymbol{A}$; $\mathrm{diag}(a, b, \ldots)$ means a diagonal matrix with the diagonal entries $(a, b, \ldots)$, $\mathrm{diag}(\boldsymbol{a})$ means a diagonal matrix formed from $\boldsymbol{a}$, and $\boldsymbol{I}_N$ denotes an $(N \times N)$ identity matrix; $\mathcal{CN}(\boldsymbol{0}, \boldsymbol{A})$ represents the zero-mean complex Gaussian distribution with covariance matrix $\boldsymbol{A}$. Furthermore, $\mathrm{Tr}(\cdot)$, $\log(\cdot)$ and $\mathbb{E}[\cdot]$ denote the trace, logarithmic and expectation operators, respectively.

II. SYSTEM MODEL
We consider an MDD/MU-MIMO system, where an $N$-element transmit antenna array at the BS uses $N_{RF}$ radio-frequency (RF) chains to serve $D$ DL mobile stations (MSs) and, simultaneously, an $\bar{N}$-element receive antenna array, also at the BS, uses $\bar{N}_{RF}$ RF chains to serve $\bar{D}$ UL MSs. Hence, the total number of MSs is $D_{sum} = D + \bar{D}$. All MSs are equipped with a single antenna. In our proposed system, the BS works in MDD mode, while the MSs are operated in HD mode. As shown in Fig. 1, we assume that the transmitter and receiver at the BS are reasonably separated in space, and that both of them are equipped with a uniformly spaced linear antenna array (ULA). We assume OFDM-assisted data transmission, that the channels between the BS and MSs experience frequency-selective fading, and that a sufficient cyclic prefix (CP) is invoked to avoid inter-symbol interference (ISI). Furthermore, following the principles of MDD [12], the subcarriers are divided into two mutually exclusive subsets, namely a DL subcarrier subset $\mathcal{M}$ with $M$ subcarriers and a UL subcarrier subset $\bar{\mathcal{M}}$ with $\bar{M}$ subcarriers, i.e., $|\mathcal{M}| = M$ and $|\bar{\mathcal{M}}| = \bar{M}$. The total number of subcarriers is $M_{sum} = M + \bar{M}$. We assume that the UL/DL MSs are scheduled in such a way that the interference generated by UL MSs on a DL MS is sufficiently low, without distorting the operation of the DL MSs' receiver ADCs.

A. COMMUNICATIONS CHANNEL MODEL
The channel between the $n$-th transmit antenna at the BS and the $d$-th DL MS is modeled by an $L$-tap frequency-selective fading channel, with the time-domain CIR (TDCIR) expressed as [24] $\boldsymbol{g}^{DL}_{n,d} = \left[g^{DL}_{n,d}[1], \ldots, g^{DL}_{n,d}[l], \ldots, g^{DL}_{n,d}[L]\right]^{T}$ (1), where $g^{DL}_{n,d}[l] = \alpha^{DL}_{n,d,l}$ follows the complex Gaussian distribution $\alpha^{DL}_{n,d,l} \sim \mathcal{CN}(0, 1/L)$. Similarly, the UL channel between the $\bar{d}$-th UL MS and the $\bar{n}$-th receive antenna at the BS is defined as the $L$-tap TDCIR $\boldsymbol{g}^{UL}_{\bar{n},\bar{d}}$. Furthermore, when the same frequency band is considered, which is the case in MDD mode, we have $\boldsymbol{g}^{DL}_{n,d} = \boldsymbol{g}^{UL}_{n,d}$, meaning that there is no distinction between UL and DL channels, i.e., they are reciprocal. Hence, when there is no confusion, the notations 'DL' and 'UL' are removed.
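According to the principles of OFDM [10], the frequency-domain CIR (FDCIR) $\boldsymbol{h}_{n,d}$ can be obtained as $\boldsymbol{h}_{n,d} = \boldsymbol{F}\boldsymbol{\Psi}\boldsymbol{g}_{n,d}$ (2), where $\boldsymbol{F}$ is the FFT matrix and $\boldsymbol{\Psi}$ maps the $L$ taps of $\boldsymbol{g}_{n,d}$ onto the $M_{sum}$ FFT inputs. The FDCIRs of the DL and UL subcarriers are then given by $\boldsymbol{h}^{DL}_{n,d} = \boldsymbol{T}_{DL}\boldsymbol{h}_{n,d}$ (3) and $\boldsymbol{h}^{UL}_{n,d} = \boldsymbol{T}_{UL}\boldsymbol{h}_{n,d}$ (4), where $\boldsymbol{T}_{DL} \in \mathbb{C}^{M \times M_{sum}}$ and $\boldsymbol{T}_{UL} \in \mathbb{C}^{\bar{M} \times M_{sum}}$ are the mapping matrices, constructed from $\boldsymbol{I}_{M_{sum}}$ by choosing the rows corresponding to the particular subcarriers assigned to DL and UL, respectively.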
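The mapping from a TDCIR to the DL and UL subcarrier channels can be illustrated with a minimal sketch. It assumes an unnormalized $M_{sum}$-point DFT, and the tap values, the value of `M_SUM` and the choice of every fourth subcarrier for UL are hypothetical, chosen only for illustration.

```python
import cmath

def fdcir(g, m_sum):
    """Frequency-domain CIR: m_sum-point DFT of the zero-padded L-tap TDCIR."""
    return [sum(tap * cmath.exp(-2j * cmath.pi * k * l / m_sum)
                for l, tap in enumerate(g))
            for k in range(m_sum)]

def split_subcarriers(h, ul_index):
    """Select the DL and UL entries of h; the two index sets are disjoint."""
    ul = set(ul_index)
    h_ul = [h[k] for k in sorted(ul)]
    h_dl = [h[k] for k in range(len(h)) if k not in ul]
    return h_dl, h_ul

g = [0.5 + 0.5j, 0.3, -0.2j, 0.1]   # hypothetical L = 4 tap TDCIR
M_SUM = 16                          # hypothetical total number of subcarriers
UL_INDEX = range(0, M_SUM, 4)       # hypothetical: every 4th subcarrier is UL
h = fdcir(g, M_SUM)
h_dl, h_ul = split_subcarriers(h, UL_INDEX)
print(len(h_dl), len(h_ul))
```

Because both subsets are read from the same vector `h`, a single estimate of the $L$ taps determines the channels of all DL and UL subcarriers, which is the property the CEst of Section IV relies on.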

B. SELF-INTERFERENCE CHANNEL MODEL
Since both UL and DL are operated in the same frequency band based on MDD, the MDD/MU-MIMO system experiences self-interference (SI), as shown in Fig. 1. We assume that the SI channel experiences Rician fading, comprising both a line-of-sight (LOS) component and a non-line-of-sight (NLOS) component, which is expressed as [23] $\boldsymbol{H}_{SI} = \sqrt{\frac{\kappa}{\kappa+1}}\,\boldsymbol{H}^{LOS}_{SI} + \sqrt{\frac{1}{\kappa+1}}\,\boldsymbol{H}^{NLOS}_{SI}$ (5), where $\kappa$ is the Rician factor. As the transmit and receive antennas at the BS are close to each other, $\boldsymbol{H}^{LOS}_{SI}$ denotes the LOS near-field flat fading channel, with the $(i,j)$-th element expressed as [25] where $\rho$ is the power normalization constant making

C. REQUIREMENT OF SELF-INTERFERENCE CANCELLATION
In full-duplex systems, the ADC is the most critical component determining the system operability and achievable performance. A practical ADC has only a limited dynamic range and resolution. Hence, if the input signal to an ADC is beyond a particular level, the signal will be distorted, yielding large quantization noise and non-linear distortion, which would further degrade the performance of the following digital-domain signal processing [26]. Specifically, when assuming a $Q$-bit ADC, the signal to quantization noise ratio (SQNR) is about $6.02Q$ dB [27]. Given the bandwidth $B$ (Hz), the noise floor at the receiver is given by [5] $-174 + k_N + 10\log_{10}(B)$ dBm, where $k_N$ is the noise figure. Hence, the maximum input signal power to the receiver is $s_{max} = -174 + k_N + 10\log_{10}(B) + 6.02Q$ dBm (7). Therefore, for an SI-contaminated signal to pass an ADC without distortion, the propagation- and analog-domain SI cancellation should provide an SIC of at least [26] $C^{Demand}_{SI} = P_{DL} - s_{max} + 10$ dB (8), where $P_{DL}$ is the transmit power of DL, while 10 dB is added to account for the PAPR, as an OFDM signal's power may rise up to 10 dB above the average power [28]. After the propagation- and analog-domain SIC, the SI input to the ADC of the BS receiver is $P_{SI} = P_{DL} - C_{SI}$, where $C_{SI}$ is the total SI reduction in the propagation and analog domains. Hence, with the aid of (8), we know that for a given $P_{DL}$, the SIC should satisfy the requirement of $C_{SI} \ge C^{Demand}_{SI}$ (9). Above we have provided the channel models of the MDD/MU-MIMO systems and analyzed the target for SI cancellation. Below we consider the transceiver design for the MDD/MU-MIMO systems.

D. TRANSMITTER MODEL
For DL transmission, let the symbol vector transmitted by the BS on the $m$-th DL subcarrier be expressed as $\boldsymbol{x}[m]$. In order to mitigate SI and attain beamforming gain, $\boldsymbol{x}[m]$ is precoded at the transmitter under a per-subcarrier power constraint, where $P_m$ is the maximum transmit power of the $m$-th DL subcarrier. The total transmit power of DL satisfies $\sum_{m=1}^{M} P_m \le P_{DL}$. As shown in Fig. 1, the transmitter precoder consists of a digital precoder $\boldsymbol{F}_{BB}[m]$ for each individual DL subcarrier and an analog precoder $\boldsymbol{F}_{RF}$ that is common to all DL subcarriers. Hence, the overall precoding for a DL subcarrier can be expressed as $\boldsymbol{F}_{RF}\boldsymbol{F}_{BB}[m]$. The baseband signal transmitted on the $m$-th subcarrier can be expressed as $\boldsymbol{s}[m] = \boldsymbol{F}_{RF}\boldsymbol{F}_{BB}[m]\boldsymbol{x}[m]$ (10), where $\boldsymbol{s}[m] \in \mathbb{C}^{N \times 1}$.

E. RECEIVER MODEL
When given the transmitted signal as shown in (10), the signals received by the $D$ MSs from the $m$-th subcarrier can be expressed as $\boldsymbol{y}[m] = \boldsymbol{H}[m]\boldsymbol{s}[m] + \boldsymbol{z}[m]$ (11), where $\boldsymbol{H}[m]$ and $\boldsymbol{z}[m]$ are the DL channel matrix and additive Gaussian noise corresponding to the $m$-th DL subcarrier, respectively. It is noteworthy that for simplicity we ignore the interference from UL MSs to DL MSs in (11), so that we can focus on the SIC and channel estimation in MDD/MU-MIMO systems. On the other side, the signals transmitted at the BS also propagate to its receive antenna array for the UL. This SI signal at the $m$-th subcarrier after the analog combining can be expressed as $\boldsymbol{y}_{SI}[m] = \boldsymbol{W}_{RF}^{H}\boldsymbol{H}_{SI}\boldsymbol{s}[m]$ (12). In (12), $\boldsymbol{W}_{RF} \in \mathbb{C}^{\bar{N} \times \bar{N}_{RF}}$ is the analog combiner, which is regarded as the pseudo-identity matrix in a digital beamforming system, i.e., $\boldsymbol{W}_{RF} = \boldsymbol{I}_{\bar{N} \times \bar{N}_{RF}}$ [32]. Based on (12), the total SI power entering the ADC in the BS receiver is given by $P_{SI} = \sum_{m=1}^{M}\mathbb{E}\left[\|\boldsymbol{y}_{SI}[m]\|_2^2\right]$. Note that the distance between the transmitter and receiver arrays at the BS is much smaller than the length of the communication links from the MSs to the BS, which makes the SI signals 50-100 dB stronger than the desired signals received from the UL. This means that although DL and UL are operated on different subcarriers, prior to digitization the UL signals would be overwhelmed by the SI signals in the ADC, if the propagation- and analog-domain SIC cannot provide sufficient SI reduction. In this case, the quantization noise may be significant and cannot be mitigated by any digital-domain signal processing technique. Therefore, a certain amount of SI reduction has to be achieved to satisfy (9) prior to the ADC at the receiver. On the other side, provided that the constraint of (9) is satisfied, as shown in Fig. 1, the received digital signals after RF processing and ADC can be expressed as (13).

III. ADAPTIVE BEAMFORMING-AIDED SELF-INTERFERENCE CANCELLATION
In this section, we address the beamforming-aided SIC implemented via the design of hybrid precoder/combiner. The objective of SIC is to make the SIC requirement of (9) be satisfied.
According to (12), the power of the SI signals before ADC, $P_{SI}$, can be evaluated for a given precoder and combiner. The first design option assumes that SI suppression depends solely on the design of $\boldsymbol{F}_{RF}$ at the transmitter. Thereby, the design of the combiner at the receiver only focuses on the UL transmissions without considering the impact of SI. In this paper, we consider the MMSE method for UL combining [33], and the full-digital combiner $\boldsymbol{W}_{MMSE}[m]$ is expressed as in [34]. The optimization problem of (17) is a typical one in the design of hybrid beamforming, which can be solved by different algorithms, such as those in [35]-[39]. Specifically, in the performance study in Section V of this paper, we introduce the projected gradient descent (PGD) algorithm [38], [39]. Readers interested in the details of the algorithm are referred to these references. After $\boldsymbol{W}_{RF}$ is obtained, the SI suppression can be executed based on the optimization problem of (18). However, this optimization problem is non-convex and hard to solve. To circumvent this dilemma and reduce the computational complexity, we propose an adaptive algorithm based on the cyclic coordinate descent (CDC) algorithm [40]-[43], so as to dynamically suppress SI. It is noteworthy that the performance of the CDC algorithm is sensitive to the initialization [44]. Hence, in order to suppress SI while simultaneously maintaining the required performance of DL, the initialization of $\boldsymbol{F}_{RF}$ is very important in our algorithm. In Section V-A, we will investigate the effect of the initialization of $\boldsymbol{F}_{RF}$ on the achievable performance. Furthermore, to calculate $P_{SI}$ during the optimization process, the digital precoder for the $m$-th subcarrier is assumed to follow the ZF principle, which yields (19), where the power allocation is obtained from the water-filling algorithm [45].
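The water-filling power allocation mentioned above can be sketched as follows. This is the textbook water-filling rule, not necessarily the exact form used with the ZF precoder here; the function name `water_filling` and the example gains are hypothetical, and the gains are assumed to be strictly positive.

```python
def water_filling(gains, total_power):
    """Return powers p_i = max(0, mu - 1/g_i) that sum to total_power."""
    # Process channels from strongest to weakest.
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    inv = [1.0 / gains[i] for i in order]
    k = len(inv)
    mu = 0.0
    while k > 0:
        # Water level if only the k strongest channels are active.
        mu = (total_power + sum(inv[:k])) / k
        if mu > inv[k - 1]:
            break          # weakest active channel still gets positive power
        k -= 1
    powers = [0.0] * len(gains)
    for rank, i in enumerate(order):
        if rank < k:
            powers[i] = mu - inv[rank]
    return powers

print(water_filling([2.0, 1.0, 0.1], 1.0))   # [0.75, 0.25, 0.0]
```

In the example, the weakest channel receives no power because its inverse gain lies above the water level, which is the behavior expected from water-filling.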
To solve the optimization problem (18) iteratively, the element $\boldsymbol{F}_{RF}(i,j)$ is first optimized by assuming that all the other elements are fixed. In this case, (18a) can be simplified to (20). Since all the elements in $\boldsymbol{F}_{RF}$ other than $\boldsymbol{F}_{RF}(i,j)$ are fixed, $\boldsymbol{A}_j$, $\zeta^{H}_{ij}$ and $\eta^{H}_{ij}$ seen in (20) are complex constants. Furthermore, under the modulus constraint of the analog precoder, i.e., $\boldsymbol{F}_{RF}(i,j) = e^{-j\theta_{ij}}$, (20) can be re-stated as a function $g(\theta_{ij})$ of the phase $\theta_{ij}$ only. Now the optimization is converted to an extreme-value problem, which can be readily solved. In detail, upon taking the derivative of $g(\theta_{ij})$ with respect to $\theta_{ij}$, we obtain $\frac{\partial g(\theta_{ij})}{\partial \theta_{ij}} = j\eta^{H}_{ij}e^{j\theta_{ij}} - j(\eta^{H}_{ij})^{*}e^{-j\theta_{ij}} = 0$, which is equivalent to $\mathrm{Im}\{\eta^{H}_{ij}e^{j\theta_{ij}}\} = 0$. Solving this equation, represented in the form of (24), under the constraint of $\theta_{ij} \in (0, 2\pi)$ yields two stationary points. However, only one of them yields the minimum value of $g(\theta_{ij})$, and this one is the optimum solution for $\theta_{ij}$. The above optimization process is repeated with respect to each of the elements in $\boldsymbol{F}_{RF}$, and the elements of $\boldsymbol{F}_{RF}$ are iteratively optimized until the cost function converges to a local minimum. Hence, before reaching the minimum, SI is gradually suppressed as the number of iterations increases. A shortcoming of the CDC algorithm is that its convergence is usually slow and dependent on the cost function of (18a), which is in turn related to the SI channel and antenna configurations [27]. Nevertheless, once the condition of (18c) is satisfied, meaning that the SI reduction provided by beamforming is sufficient to make the ADC work efficiently, more iterations for further SI reduction are no longer necessary. Hence, once the constraint of (18c) is met, the process of SI suppression can be terminated to save time. In Section V-A, we will demonstrate the convergence performance of the CDC algorithm.
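The single-phase update can be checked numerically with a short sketch. It assumes that the cost has the form $g(\theta) = c + 2\,\mathrm{Re}\{\eta e^{j\theta}\}$, which is the form implied by the derivative stated above, with `eta` standing for $\eta^{H}_{ij}$; the helper names and the numerical value of `eta` are hypothetical.

```python
import cmath
import math

def best_phase(eta):
    """Minimiser of g(theta) = c + 2*Re(eta*exp(1j*theta)) on [0, 2*pi)."""
    # The cosine term is most negative when theta + arg(eta) = pi.
    return (math.pi - cmath.phase(eta)) % (2 * math.pi)

def g(theta, eta, const=0.0):
    """Assumed single-phase cost; const collects the theta-independent terms."""
    return const + 2 * (eta * cmath.exp(1j * theta)).real

eta = 0.8 - 0.6j                         # hypothetical complex constant
theta_star = best_phase(eta)
grid_min = min(g(2 * math.pi * k / 3600, eta) for k in range(3600))
print(g(theta_star, eta) <= grid_min)
```

Of the two stationary points, $\theta = -\arg(\eta)$ is the maximum and $\theta = \pi - \arg(\eta)$ is the minimum, where the cost reaches $c - 2|\eta|$.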
In summary, the first design option is stated as Algorithm 1. For UL reception, when given $\boldsymbol{W}_{MMSE}[m]$, the hybrid combiner is obtained by the PGD algorithm. For SI suppression, after the initialization of $\boldsymbol{F}_{RF}$, the analog precoder is iteratively updated to reduce the SI power based on the CDC algorithm, until the SIC meets the requirement. During every iteration, the digital precoder $\boldsymbol{F}_{BB}[m]$ is derived based on the ZF method and the water-filling algorithm.
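The iterate-until-sufficient structure of the SI suppression loop can be illustrated on a deliberately simplified toy problem. The sketch below is not Algorithm 1: it assumes a single RF chain and a scalar SI observation $\sum_i h_i f_i$ with unit-modulus weights $f_i$, and it omits the digital precoder and the power allocation. All names and values are hypothetical.

```python
import cmath
import math
import random

def si_power(h, f):
    """Power of the scalar residual sum_i h_i * f_i."""
    return abs(sum(hi * fi for hi, fi in zip(h, f))) ** 2

def coordinate_descent_sic(h, f_init, target_db, max_sweeps=50):
    """Cyclically re-phase unit-modulus weights until the SI power has dropped
    by target_db relative to the initial point, or max_sweeps is reached."""
    f = list(f_init)
    p0 = si_power(h, f)
    history = [p0]
    if p0 == 0:
        return f, history
    for _ in range(max_sweeps):
        for i in range(len(f)):
            rest = sum(h[k] * f[k] for k in range(len(f)) if k != i)
            if rest != 0 and h[i] != 0:
                # Anti-align h_i * f_i with the sum of all the other terms.
                angle = cmath.phase(rest) + math.pi - cmath.phase(h[i])
                f[i] = cmath.exp(1j * angle)
        history.append(si_power(h, f))
        if history[-1] == 0 or 10 * math.log10(p0 / history[-1]) >= target_db:
            break   # requirement met, so further sweeps are skipped
    return f, history

rng = random.Random(0)
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(8)]
f, history = coordinate_descent_sic(h, [1.0 + 0j] * 8, target_db=65.0)
print(len(history) - 1, "sweeps")
```

Each coordinate update is the exact minimizer over one phase, so the SI power never increases from one sweep to the next, and the loop exits as soon as the demanded reduction is reached.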

B. COMBINING OPTIMIZATION AIDED SELF-INTERFERENCE CANCELLATION
In the context of the second design option, we assume that $\bar{N} > N$. In this scenario, the analog precoder $\boldsymbol{F}_{RF}$ is derived by maximizing the DL spectral efficiency without considering the effect of the SI on the UL receiving. Instead, SI suppression is only attempted through the design of $\boldsymbol{W}_{RF}$. Therefore, the design of the hybrid combiner in Option 2 is similar to the design of the hybrid precoder in Option 1, except that there is no power allocation in the combiner's design. Furthermore, the analog and digital precoders in Option 2 can be designed by employing the PGD algorithm, when an overall precoder in the form of (19) is prepared.
It can be argued that the design in Option 2 has lower complexity than that in Option 1. The reason is that in Option 1, the digital precoder and analog precoder need to be iteratively updated, so that the SI on the UL receiving is gradually reduced to an allowed value. By contrast, in Option 2, once the hybrid DL precoder is obtained, the SI on UL receiver becomes stable. Hence, no iteration is required between the design of the digital combiner and that of the analog combiner. In other words, the analog combiner can be firstly designed to suppress SI to a sufficient level. Then, digital combiner can be derived for a fixed analog combiner. In summary, the design in Option 2 is stated as Algorithm 2.

Algorithm 2 Combining Optimization Aided SIC (Option 2)
Require: $P_m$, $P_{DL}$, $C^{Demand}_{SI}$
It is worth noting that, following Options 1 and 2, there is a third option for the design, which optimizes $\boldsymbol{W}_{RF}$ and $\boldsymbol{F}_{RF}$ jointly. However, it can be shown that the SIC performance is mainly determined by the DL transmitter or the UL receiver, depending on which of them has more antenna elements. As demonstrated in Section V, the side (either DL transmitter or UL receiver) with fewer antenna elements can hardly provide any gain for SIC. Owing to this, the third design option is not further considered in this paper.

IV. CHANNEL ESTIMATION IN MDD/MU-MIMO SYSTEMS
As argued in Section I, in MDD/MU-MIMO systems the DL channels can be estimated based on the observations received from the UL channels by exploiting the reciprocity existing between the DL and UL subchannels, which results from the frequency-domain correlated fading. In this section, we consider the CEst in MDD/MU-MIMO systems. We first consider the CEst based on orthogonal transmission and focus on the design of frequency-domain pilot symbol (PS) vectors and the associated conditions. Then, the CEst in the scenario of non-orthogonal transmission is considered. Note that below we only consider the CEst of communication channels. The SI channel can be estimated by the various approaches proposed in the literature, e.g., in [46]-[49].
By observing (2), (3) and (4), we know that the CEst can be initialized with the UL training in the frequency domain. Using the frequency-domain training, the time-domain channel $\boldsymbol{g}_{n,d}$ can be estimated. Then, from $\boldsymbol{g}_{n,d}$, both the DL and UL channels of the subcarriers can be obtained with the aid of (3) and (4).
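Let us assume that all MSs synchronously transmit their PSs. For CEst, we assume that each MS transmits PSs on all the UL subcarriers. The fading of the channels is assumed to be slow enough to make use of the reciprocity for UL/DL processing. Then, consider that the PSs transmitted by the $d$-th MS are expressed as $\boldsymbol{x}_d$. The observations received by the $n$-th antenna at the BS can be written as $\boldsymbol{y}_n = \sum_{d=1}^{D_{sum}}\boldsymbol{X}_d\boldsymbol{h}^{UL}_{n,d} + \boldsymbol{z}_n = \sum_{d=1}^{D_{sum}}\boldsymbol{X}_d\boldsymbol{T}_{UL}\boldsymbol{F}\boldsymbol{\Psi}\boldsymbol{g}_{n,d} + \boldsymbol{z}_n$ (28), where $\boldsymbol{X}_d = \mathrm{diag}\{\boldsymbol{x}_d\}$, $\boldsymbol{z}_n \sim \mathcal{CN}(\boldsymbol{0}, \sigma^2\boldsymbol{I}_{\bar{M}})$, $\boldsymbol{h}^{UL}_{n,d}$ is given by (4), and with the aid of (2), i.e., $\boldsymbol{h}^{UL}_{n,d} = \boldsymbol{T}_{UL}\boldsymbol{F}\boldsymbol{\Psi}\boldsymbol{g}_{n,d}$, $\boldsymbol{y}_n$ is directly expressed in terms of the TDCIR $\boldsymbol{g}_{n,d}$.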
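From (28), we can see that if the PSs can be designed to satisfy the orthogonality condition (29), under which $(\boldsymbol{X}_i T_{\text{UL}} \boldsymbol{F}\boldsymbol{\Psi})^H (\boldsymbol{X}_j T_{\text{UL}} \boldsymbol{F}\boldsymbol{\Psi})$ is a constant multiple of the identity matrix when $i = j$ and an all-zero matrix when $i \neq j$, then the TDCIR $\boldsymbol{g}_{n,i}$ from MS $i$ can be readily estimated by the LS method, as given by (30). The PSs satisfying (29) are referred to as the orthogonality-achieving PSs. With the aid of the approach proposed in [50], it can be shown that if $\bar{M} \geq L D_{\text{sum}}$ and the $\bar{M}$ UL subcarriers are evenly distributed among the $M_{\text{sum}}$ subcarriers, the set of PSs given by (31), where $\xi = \bar{M}/D_{\text{sum}}$, are orthogonality-achieving PSs. This result is stated as Proposition 1. Proof: See Appendix.

However, if the conditions stated in Proposition 1 are not satisfied, or if more random PSs are used, orthogonality-achieving PSs may not be available. In this case, we can write (28) as

$$\boldsymbol{y}_n = \boldsymbol{Q}_i \boldsymbol{g}_{n,i} + \boldsymbol{T}_i + \boldsymbol{z}_n \tag{32}$$

where, by definition, $\boldsymbol{Q}_i = \boldsymbol{X}_i T_{\text{UL}} \boldsymbol{F}\boldsymbol{\Psi}$, and $\boldsymbol{T}_i = \sum_{d \neq i} \boldsymbol{Q}_d \boldsymbol{g}_{n,d}$ is the interference signal from the other MSs. In order to suppress the interference from the other MSs, let us introduce the LMMSE estimator for CEst. Then, the estimate of $\boldsymbol{g}_{n,i}$ in (28) is obtained by solving the optimization problem (33). Assume that the TDCIRs from different MSs are uncorrelated, i.e., $E[\boldsymbol{g}_{n,d} \boldsymbol{g}_{n,d'}^H] = \boldsymbol{0}, \forall d \neq d'$, which is usually satisfied, as MSs are in general well separated in space. Then, the solution to (33) is given by (34), which involves the interference covariance matrix $E[\boldsymbol{T}_i \boldsymbol{T}_i^H]$.

Correspondingly, the estimate $\hat{\boldsymbol{g}}_{n,i}$ is given by (35), which applies the resulting LMMSE weights to $\boldsymbol{y}_n$. It is well known that LMMSE yields biased estimation. To attain an unbiased estimator, we can form the estimate $\check{\boldsymbol{g}}_{n,i}$ as in (36). Again, after obtaining $\check{\boldsymbol{g}}_{n,i}$, if MS $i$ is a UL user and antenna $n$ is a UL receive antenna at BS, BS uses (4) to obtain the frequency-domain channel for UL detection. By contrast, if MS $i$ is a DL user and antenna $n$ is a DL transmit antenna at BS, BS uses (3) to obtain the frequency-domain channel for DL precoding.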
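To make the difference between the LS and LMMSE estimators concrete when orthogonality-achieving PSs are not available, the following minimal Python sketch simulates the model of (32) with randomly placed UL subcarriers and random QPSK pilots. Several ingredients are assumptions made only for illustration and are not taken from the paper: $T_{\text{UL}}\boldsymbol{F}\boldsymbol{\Psi}$ is modeled as the UL rows and first $L$ columns of a unitary DFT matrix, the taps are i.i.d. with covariance $\boldsymbol{I}/L$, the LS estimate is a per-MS pseudo-inverse, and the LMMSE estimate uses the textbook form $\boldsymbol{R}_g \boldsymbol{Q}_i^H \boldsymbol{R}_y^{-1} \boldsymbol{y}_n$. All function names and parameter values are hypothetical, and the bias correction is not included.

```python
import numpy as np


def ul_dft_rows(m_sum, ul_idx, num_taps):
    """UL rows and first `num_taps` columns of the unitary M_sum-point DFT (assumed model)."""
    phase = np.outer(ul_idx, np.arange(num_taps)) / m_sum
    return np.exp(-2j * np.pi * phase) / np.sqrt(m_sum)


def complex_gaussian(rng, shape, var):
    """Circularly symmetric complex Gaussian samples with variance `var`."""
    return np.sqrt(var / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


def compare_ls_lmmse(num_trials=200, m_sum=64, m_ul=32, num_taps=4,
                     num_ms=4, noise_var=1e-3, seed=0):
    """Average squared error of LS and LMMSE estimates of the TDCIR of MS 0."""
    rng = np.random.default_rng(seed)
    r_g = np.eye(num_taps) / num_taps  # assumed tap covariance E[g g^H]
    err_ls = err_lmmse = 0.0
    for _ in range(num_trials):
        # Randomly placed UL subcarriers and random QPSK pilots (no orthogonality).
        ul_idx = np.sort(rng.choice(m_sum, size=m_ul, replace=False))
        g_ul = ul_dft_rows(m_sum, ul_idx, num_taps)
        pilots = np.exp(0.5j * np.pi * rng.integers(0, 4, size=(num_ms, m_ul)))
        q = [pilots[d][:, None] * g_ul for d in range(num_ms)]  # Q_d = X_d G_UL
        g = complex_gaussian(rng, (num_ms, num_taps), 1.0 / num_taps)
        y = sum(q[d] @ g[d] for d in range(num_ms))
        y = y + complex_gaussian(rng, m_ul, noise_var)
        # LS for MS 0: the other MSs' signals act as unmodelled interference.
        g_ls = np.linalg.pinv(q[0]) @ y
        # LMMSE: R_g Q_0^H R_y^{-1} y, with R_y = sum_d Q_d R_g Q_d^H + sigma^2 I.
        r_y = sum(q[d] @ r_g @ q[d].conj().T for d in range(num_ms))
        r_y = r_y + noise_var * np.eye(m_ul)
        g_lmmse = r_g @ q[0].conj().T @ np.linalg.solve(r_y, y)
        err_ls += np.sum(np.abs(g_ls - g[0]) ** 2)
        err_lmmse += np.sum(np.abs(g_lmmse - g[0]) ** 2)
    return err_ls / num_trials, err_lmmse / num_trials
```

Calling `compare_ls_lmmse()` returns the two average squared errors. Under these assumptions the LMMSE error is expected to be the smaller of the two, because the covariance $\boldsymbol{R}_y$ accounts for the interference term $\boldsymbol{T}_i$ that the per-MS LS estimate ignores.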

V. PERFORMANCE RESULTS
In this section, we first evaluate the SIC performance. Then, under the assumption of ideal CEst, the performance results for the MDD/MU-MIMO systems are presented and discussed, when the precoder and combiner designed in Section III are employed. Next, the performance of the CEst method introduced in Section IV is investigated. Finally, we present the simulation results for the MDD/MU-MIMO systems when the LS and LMMSE CEsts are employed.
In our simulations, we assume the channel model presented in Section II-A, and a ULA at BS with half-wavelength spacing between adjacent antenna elements. The distance $r_{ij}$ between the $i$-th element of the transmitter and the $j$-th element of the receiver is set according to [21], and the default angle between the transmitter array and the receiver array is $\phi = 120^\circ$. We further assume that the number of CIR taps of the communication channels is $L = 4$. For the SI channel, we set $\kappa = 20$ dB as the default value.

A. SELF-INTERFERENCE CANCELLATION
According to [51], we assume that the transmit power of BS and the signaling bandwidth are $P_{\text{DL}} = 30$ dBm and $B = 20$ MHz, respectively. The total transmit power is uniformly allocated to the DL subcarriers. We further assume that 12-bit ADCs are used by the UL receiver at BS. Then, the maximum input power to the UL receiver and the required SIC can be found from (7) and (8), which are $s_{\max} \approx -25$ dBm and $C_{\text{SI}}^{\text{Demand}} = 65$ dB, respectively. In other words, the system needs to achieve at least 65 dB of SI reduction for the UL receiver at BS to work efficiently.
In the first experiment, we demonstrate the SIC performance of the MDD/MU-MIMO systems with the transceivers designed under Option 1 of Section III. In this study, we assume that the numbers of antennas and RF chains at the UL receiver are $\bar{N} = 32$ and $\bar{N}_{\text{RF}} = 8$, respectively. The other parameters are detailed in the caption of Fig. 2. In this figure, we compare the SIC performance of the proposed iterative coordinate descent algorithm when the analog precoder is either randomly initialized or initialized by optimizing the DL sum-rate as in equation (19) of [40]. These are referred to as the 'Random initial' and 'Optimized initial' analog precoders, respectively, in the figure. Note that the random initial analog precoder is obtained by first extracting the angle information from the null-space matrix of $\boldsymbol{H}_{\text{SI}}$, i.e., from $\boldsymbol{V}_{(N-\text{rank}(\boldsymbol{H}_{\text{SI}})):\text{end}}$, and then randomly selecting $N_{\text{RF}}$ columns of $\boldsymbol{V}_{(N-\text{rank}(\boldsymbol{H}_{\text{SI}})):\text{end}}$ to construct $\boldsymbol{F}_{\text{RF}}$. Note furthermore that the SIC behavior from the 1st to the 20th iteration is separately depicted in Fig. 2(a) to highlight the relatively sharp changes.

As shown in Fig. 2, the SI reduces as the number of iterations increases. Specifically, when BS employs $N = 128$ DL transmit antennas and $N_{\text{RF}} = 16$ DL RF chains, 65 dB of SI reduction can be achieved after 80 iterations with either the random initial or the optimized initial analog precoder, and the performance achieved by the two is very similar. Fig. 2 also illustrates that employing more DL transmit antennas allows more SI reduction for a given number of iterations. For $N_{\text{RF}} = 16$, when the number of DL transmit antennas is decreased from 128 to 64, the SI reduction is reduced from about 65 dB to about 25 dB after 80 iterations. Furthermore, Fig. 2 shows that when the number of RF chains is reduced from 32 to 16 while keeping the number of DL transmit antennas fixed, more than 2.5 dB of SI reduction can be obtained. Note that the above observations can be similarly obtained from the systems operated under Option 2, when the above-stated DL transmit antennas are replaced by the UL receive antennas.
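The construction of the 'Random initial' analog precoder described above can be sketched as follows. This is only an illustration under stated assumptions: the null-space basis is taken as the right singular vectors of $\boldsymbol{H}_{\text{SI}}$ beyond its rank (rather than the exact index range quoted in the text), the entries are normalized to modulus $1/\sqrt{N}$, and the function name `random_initial_analog_precoder` is hypothetical.

```python
import numpy as np


def random_initial_analog_precoder(h_si, n_rf, rng):
    """Constant-modulus F_RF built from the phases of random null-space vectors of H_SI."""
    n_tx = h_si.shape[1]
    rank = np.linalg.matrix_rank(h_si)
    # Rows of vh are the conjugated right singular vectors; the columns of V
    # with index >= rank span the null space of H_SI.
    _, _, vh = np.linalg.svd(h_si, full_matrices=True)
    null_basis = vh.conj().T[:, rank:]
    if null_basis.shape[1] < n_rf:
        raise ValueError("null space of H_SI has fewer than n_rf dimensions")
    cols = rng.choice(null_basis.shape[1], size=n_rf, replace=False)
    # Keep only the angle information, as required by phase-shifter hardware.
    phases = np.angle(null_basis[:, cols])
    return np.exp(1j * phases) / np.sqrt(n_tx)  # assumed 1/sqrt(N) normalization
```

Because only the phases of the null-space vectors are retained, $\boldsymbol{H}_{\text{SI}}\boldsymbol{F}_{\text{RF}}$ is in general no longer exactly zero, which is consistent with the need for further iterations of the SIC algorithm after initialization.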
The second experiment considers the effect of the Rician factor $\kappa$ and the array angle $\phi$ on the SIC performance. Again, we assume that the system is operated under Option 1 with the parameters shown in the caption of Fig. 3, where the range from the 1st to the 20th iteration is separately depicted in Fig. 3(a) to highlight the behavior. It can be observed that the Rician factor and the array angle have a big impact on the SIC performance. Specifically, in terms of the Rician factor, it can be seen that when the LoS component becomes more dominant, the proposed SIC method becomes less efficient. However, we should note that while the beamforming methods work inefficiently with the LoS component, other easy-to-implement approaches, such as adding blockage between the transmitter and receiver [6], [52]-[55], may be employed to significantly mitigate the LoS SI. In terms of the angle between the transmit and receive antenna arrays, Fig. 3 shows that the narrower the angle is, the more SI can be reduced. This is because when the angle is narrower, the SI power imposed by a given transmit array element on all the receive array elements is nearly the same, which is beneficial to beamforming-based SIC. By contrast, if the angle between the transmit/receive arrays is wide, the distances from a given transmit array element to the receive array elements can be very different. Hence, the SI power from a given transmit array element to the receive array elements is very different. Consequently, it is difficult for the beamformer to suppress all of them efficiently at the same time.
Additionally, from Figs. 2 and 3 it can be inferred that the SI reducing rate and the SIC potential provided by the proposed CDC algorithm depend on the antenna configuration and the SI channel's characteristics. For instance, if the SI channel only has NLoS components, the CDC algorithm makes the SI approach a fixed value after about 10 iterations, yielding an SI reduction of about 300 dB, which is much larger than the required $C_{\text{SI}}^{\text{Demand}}$. As shown in Figs. 2 and 3, in some cases the SI reducing rate is relatively small, but the CDC algorithm can still achieve the required SIC. For example, when the Rician factor is 100, the algorithm is able to provide about 65 dB of SI reduction after about 80 iterations. In some other cases, such as when the Rician factor is $10^5$, the SI reducing rate is very small and the SIC requirement of (18b) is hard to meet, even after many iterations. However, it is worth noting that in this case the LoS propagation is dominant and the SI can be efficiently suppressed by other approaches, such as using blockage. In summary, the results of Figs. 2 and 3 show that to attain good SI reduction, we can increase the number of antennas at the side implementing SIC and/or reduce the angle between the TX/RX antenna arrays.

Finally, we compare the SIC performance of the MDD/MU-MIMO systems with the transceivers designed by Option 1 and Option 2, respectively, in Fig. 4. The beamforming-based SIC algorithm presented in [23] is shown as the benchmark. In this study, we set the parameters to $N = 128$, $\bar{N} = 32$, $N_{\text{RF}} = 16$ and $\bar{N}_{\text{RF}} = 8$ for Option 1, for Option 2 in the case of $N \geq \bar{N}$, and also for the algorithm presented in [23]. For Option 2 in the case of $N < \bar{N}$, we set $N = 32$, $\bar{N} = 128$, $N_{\text{RF}} = 8$, and $\bar{N}_{\text{RF}} = 16$. From Fig. 4 we observe that Option 1 and Option 2 are equally efficient for SI mitigation when the same number of antennas is used for SI suppression. Moreover, as shown in Fig. 4, in the case of $N \geq \bar{N}$, if Option 2 is employed to mitigate SI, i.e., SI is suppressed by the receive antenna array, the SIC gain is very limited. Therefore, for a deployment with $N \geq \bar{N}$ or $N \leq \bar{N}$, there is little benefit in implementing joint transmit/receive beamforming for SI mitigation. This is because, in contrast to using either transmit beamforming in the case of $N \geq \bar{N}$ or receive beamforming in the case of $N \leq \bar{N}$, the SIC gain provided by joint transmit/receive beamforming is marginal, while the increase of implementation complexity is significant. Additionally, comparing the Option 1 result with case (1) of Fig. 2(b), i.e., $N = 128$ and $N_{\text{RF}} = 16$, where both use the same parameters, we can see that the proposed algorithm is capable of providing the required 65 dB of SI reduction after 80 iterations. By contrast, although the SIC algorithm presented in [23] can provide 30 dB of SI reduction after 5 iterations, it saturates at this value and is unable to achieve the required SI reduction in the analog domain, no matter how many iterations of the algorithm are executed.

In Table 2, we summarize the comparison between the proposed SIC methods and the method presented in [23] in terms of their complexity and SIC capability. The complexity shown in Table 2 includes both the complexity of SI reduction and that of the digital precoder/combiner, when the number of antenna elements and that of RF chains at the transmitter or receiver are given. First, regarding the SIC performance, as we stated in Section III-B, both Option 1 and Option 2 ($N \leq \bar{N}$) have the highest SIC capability. However, as the SIC is independent of the design of the precoder, Option 2 ($N \leq \bar{N}$) demands lower computational complexity than Option 1. As shown in Table 2, Option 2 ($N \geq \bar{N}$) also has low computational complexity, but it is unable to provide sufficient SI suppression due to the constraint on the antenna arrays, as shown in Fig. 4.
As for the method presented in [23], it can provide up to 30 dB of SI reduction after about 5 iterations, but no further SI reduction is available, no matter how many iterations are executed. Hence, taking into account the SIC capability and the required complexity shown in Table 2, Option 2 with $N \leq \bar{N}$ constitutes the most desirable SIC method.

B. PERFORMANCE OF HYBRID MDD/MU-MIMO SYSTEMS WITH SI CANCELLATION
We now demonstrate the achievable performance of the hybrid MDD/MU-MIMO systems with SI cancellation. For this purpose, we consider an MDD/MU-MIMO system, where BS employs $N = 128$ transmit antennas and $N_{\text{RF}} = 16$ DL RF chains to support $D = 6$ DL MSs, as well as $\bar{N} = 32$ receive antennas and $\bar{N}_{\text{RF}} = 8$ UL RF chains to serve $\bar{D} = 4$ UL MSs. Unless explicitly noted, the transmit powers of BS and MS are set to $P_{\text{DL}} = 30$ dBm and $P_{\text{MS}} = 20$ dBm, respectively [51], [56]. Furthermore, at BS, the transmit power is evenly assigned to the $M$ DL subcarriers, while the transmit power of a DL subcarrier is assigned to the $D$ DL MSs based on the water-filling principle. The total bandwidth is assumed to be 20 MHz, and the numbers of DL and UL subcarriers are $M = 64$ and $\bar{M} = 32$, respectively. The MSs are assumed to be uniformly distributed within a circular area of radius $R = 60$ m. Furthermore, the pathloss for an MS at distance $d$ from BS is modeled as $PL(\text{dB}) = 72 + 29.2 \log_{10}(d)$. Additionally, the power spectral density of noise is $-173$ dBm/Hz. In the following figures, the average sum rate denotes the total rate of a system, including both DL and UL. Note that a comprehensive comparison of MDD with TDD/IBFD in the mmWave environment can be found in [12], and a more general comparison of MDD with FDD/TDD can be found in [11]. Readers interested in more details about the comparison are referred to these references.

From Fig. 5, we observe that the ideal full-digital MDD/MU-MIMO system without SI provides the upper-bound performance. Note that here 'without SI' means that the required SIC can be achieved by an easy-to-implement SIC approach. However, when hybrid beamforming assisted SIC is considered, as Fig. 5 shows, the choice of initial analog precoder under Option 1 has a big impact on the achievable sum-rate. The optimized initial analog precoder achieves a much higher sum-rate than the random initial analog precoder.
By contrast, when operated under Option 2, the optimized initial analog combiner can only achieve performance similar to that of Option 1 with the random initial analog precoder. The reason is that in this study we assumed $\bar{N} < N$ and, in this case, as argued in Section III, the performance of SIC should be dominated by the transmitter-based SIC, i.e., the Option 1 design.

For Fig. 6, we assume that the remaining SI not suppressed by the analog precoder is ideally suppressed by another easy-to-implement approach. Therefore, as shown in Fig. 6, the highest sum-rate is observed before the analog precoder starts operating. When the analog precoder is operated with more iterations, the achieved sum-rate reduces, because more of the degrees of freedom provided by the transmit antennas are used for SIC. However, the achievable sum-rate becomes steady after only about 3 or 4 iterations, which gives the sum-rate cost of SIC. By contrast, as shown in Fig. 2, the amount of SI suppressed increases monotonically with the number of iterations.

The impact of the number of DL transmit antennas $N$ and the number of DL RF chains $N_{\text{RF}}$ on the sum-rate performance of MDD/MU-MIMO systems is shown in Fig. 7, assuming that SI is suppressed using the Option 1 method. For a given number of DL transmit antennas, the achievable sum-rate increases as the number of RF chains increases, at the cost of increased implementation complexity. By contrast, when $N_{\text{RF}}$ is fixed, the sum-rate increases as $N$ increases, owing to the improved SIC capability.

As each UL MS has fixed transmit power, while the total BS transmit power is shared by all DL MSs, the total throughput of the MDD/MU-MIMO system is dominated by the UL when the transmit power of BS is relatively low. In this case, when the UL employs more subcarriers, the total throughput of the MDD/MU-MIMO system is higher.
By contrast, when BS's transmit power is sufficiently high, the system's throughput becomes DL dominant. Correspondingly, employing more DL subcarriers provides higher throughput for the MDD/MU-MIMO system.

In Fig. 9, we investigate the effect of the numbers of DL/UL MSs, i.e., $D$ and $\bar{D}$, on the sum-rate performance of MDD/MU-MIMO systems, while the other parameters are set to their default values. As seen in Fig. 9, first, supporting more DL and/or UL MSs in general improves the total throughput of MDD/MU-MIMO systems. Second, when $D$ is fixed, the total throughput of MDD/MU-MIMO systems increases as the value of $\bar{D}$ increases. Finally, in the case that $\bar{D}$ is fixed, we observe that the total throughput achieved by $D = 12$ is slightly lower than that obtained by $D = 6$ when BS's transmit power is relatively low. However, when BS's transmit power is relatively high, the observation reverses, i.e., the total throughput achieved by $D = 12$ is higher than that obtained by $D = 6$. The reason behind this observation is that at relatively low transmit power, the systems' total throughput is dominated by the UL MSs. By contrast, at relatively high transmit power, the systems' total throughput is dominated by the DL MSs, owing to the contribution from the joint power allocation among the DL MSs.

As shown in Fig. 10, when the optimized initial analog precoder is employed, the achieved sum-rate reduces as $L$ increases. This is because when $L$ increases, the fading experienced by different subcarriers becomes more random, making the optimized initial analog precoder behave more like the random initial analog precoder. By contrast, when the random initial analog precoder is employed, the achieved sum-rate is very similar regardless of the value of $L$. Hence, while the random initial analog precoder usually achieves a lower sum-rate than the optimized initial analog precoder, it is more robust to the time variation of channels.
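The simulation setup of this subsection can be made more tangible with a short Python sketch of the link budget and of water-filling power allocation. The pathloss model, the noise power spectral density, the bandwidth and the cell radius are the values stated above. Everything else is an assumption for illustration: BS is placed at the centre of the circular area, a hypothetical minimum distance of 1 m is enforced to keep the logarithm finite, and `water_filling` is the textbook algorithm applied to generic gain-to-noise ratios rather than to the effective channels obtained after precoding.

```python
import math

import numpy as np


def pathloss_db(distance_m):
    """Pathloss model PL(dB) = 72 + 29.2 log10(d)."""
    return 72.0 + 29.2 * math.log10(distance_m)


def noise_power_dbm(bandwidth_hz, psd_dbm_per_hz=-173.0):
    """Noise power over the given bandwidth for a flat noise PSD."""
    return psd_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)


def drop_users(num_users, radius_m, rng, min_dist_m=1.0):
    """Distances of MSs dropped uniformly over a disc centred at BS (assumed geometry)."""
    dist = radius_m * np.sqrt(rng.uniform(size=num_users))
    return np.maximum(dist, min_dist_m)  # hypothetical minimum distance


def water_filling(gains, total_power):
    """Maximize sum(log2(1 + p_k * gains_k)) subject to sum(p_k) = total_power, p_k >= 0."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]  # strongest channel first
    inv = 1.0 / gains[order]
    power = np.zeros_like(gains)
    for k in range(len(gains), 0, -1):
        # Water level if only the k strongest channels are active.
        level = (total_power + inv[:k].sum()) / k
        if level > inv[k - 1]:  # the weakest active channel still gets power
            power[order[:k]] = level - inv[:k]
            break
    return power
```

For instance, an MS at the cell edge ($d = 60$ m) experiences a pathloss of about 123.9 dB, and the noise power over 20 MHz is about $-100$ dBm. With gain-to-noise ratios of 4, 1 and 0.1 and unit total power, `water_filling` allocates 0.875 and 0.125 to the two strongest channels and nothing to the weakest one.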

C. CHANNEL ESTIMATION
In order to investigate the performance of CEst, we consider an MDD/MU-MIMO system where a BS employs a 128 × 32 antenna array, and the numbers of DL and UL subcarriers are $M = 128$ and $\bar{M} = 64$, respectively. Specifically, in Fig. 11, we compare the MSE performance of CEst in three cases. In the first case, the LS CEst with orthogonality-achieving PSs, i.e., with the settings satisfying Proposition 1, is considered. In the second case, we also assume the LS CEst, but with the UL subcarriers randomly selected, so that orthogonality-achieving PSs cannot be guaranteed. Finally, in the case of the LMMSE CEst, the UL subcarriers are also randomly distributed. From Fig. 11 we observe that the LS CEst with orthogonality-achieving PSs achieves better MSE performance than the other two CEsts, while the LS CEst with randomly selected UL subcarriers achieves the worst MSE performance. Furthermore, the LS CEst with random UL subcarriers experiences interference that the method is unable to suppress, which results in an MSE floor. By contrast, the LMMSE CEst is capable of efficiently suppressing the interference and removing the MSE floor. We should note that, although the LS CEst with orthogonality-achieving PSs outperforms the LMMSE CEst in terms of MSE performance, it has disadvantages, such as the small number of orthogonality-achieving PSs available under the constraint of Proposition 1.
Finally, in Fig. 12 we compare the achievable sum-rate of MDD/MU-MIMO systems when the channel knowledge is obtained by the LS CEst with orthogonality-achieving PSs or by the LMMSE CEst with random UL subcarriers. Furthermore, the case of ideal CEst is included as a benchmark. Explicitly, both CEst schemes work efficiently over the SNR range considered. The gap between the sum-rate achieved with ideal channel knowledge and that achieved with practical CEst is marginal. When comparing the LS and LMMSE methods, we find that the LS CEst provides an extra 2.5 bits/s/Hz beyond the sum-rate achieved by the LMMSE CEst. However, we should remember that the LS CEst with orthogonality-achieving PSs is limited by Proposition 1.

VI. CONCLUSION
In this paper, we have investigated adaptive SIC methods based on the hybrid beamforming design of both transmitter and receiver, as well as the CEst in MDD/MU-MIMO systems. We have first highlighted the design of the hybrid transmitter precoder for DL transmission and the hybrid receiver combiner for UL signal detection, both of which can simultaneously suppress SI to a desired level. Our studies show that SI can be dynamically mitigated by employing either analog transmitter precoding or analog receiver combining. Furthermore, it is shown that SIC should be implemented at the BS transmitter or at the BS receiver, depending on which of them has more antenna elements. Then, the CEst in MDD/MU-MIMO systems has been designed by exploiting the reciprocity existing between UL/DL channels, resulting from the correlated fading of subcarriers. Our studies reveal that, first, when the number of MSs is relatively small and the PSs are evenly arranged across the subcarriers, orthogonality-achieving PSs can be designed. In this case, an LS CEst achieves optimum performance. However, when randomly distributed PSs have to be used due to, for example, a large number of MSs, the performance of the LS CEst degrades significantly. Instead, an LMMSE CEst is near-optimum, and is capable of significantly improving the performance over the LS CEst. Finally, the performance of the MDD/MU-MIMO systems with our proposed hybrid beamforming SIC method has been investigated when CSI is provided by our proposed CEst, showing that the achievable performance can be close to that with ideal CSI.
MDD provides an opportunity for UL/DL channels to jointly share the frequency-domain resources. Our future work will consider the hybrid transceiver design in conjunction with joint UL/DL resource allocation in MDD/MU-MIMO systems.
where $\boldsymbol{0}_L$ is an $(L \times L)$ all-zero matrix, and the scaling factor $\bar{M}/M_{\text{sum}}$ is a constant.
Since the UL subcarriers are uniformly spaced, with a spacing of $l$ subcarriers between two adjacent UL subcarriers, it can be shown that the matrix $T_{\text{UL}} \boldsymbol{F}\boldsymbol{\Psi}$ is given by (39), where $\bar{m}_1, \bar{m}_2, \cdots, \bar{m}_{\bar{M}}$ are the indices of the UL subcarriers. Note that, to obtain (39), the relationships $\bar{m}_2 - \bar{m}_1 = \ldots = \bar{m}_{\bar{M}} - \bar{m}_{\bar{M}-1} = l$ are used. Let

$$\tilde{\boldsymbol{P}} = \boldsymbol{P}[m]^H \boldsymbol{P}[n] = (\boldsymbol{X}_m \boldsymbol{G}_{\text{UL}})^H (\boldsymbol{X}_n \boldsymbol{G}_{\text{UL}}), \quad 1 \leq m, n \leq D_{\text{sum}} \tag{40}$$

Then, upon applying (31) and with the aid of (39), we can show that the $(u, v)$-th element of $\tilde{\boldsymbol{P}}$ is given by (41). In the case of $m \neq n$, since $\tilde{\boldsymbol{P}}$ is an $(L \times L)$ matrix, we have $-(L-1) \leq u - v \leq L-1$. Hence, $u - v - (m-n)\xi \neq 0$, provided that $\xi \geq L$. Therefore, we have $\boldsymbol{P}[m]^H \boldsymbol{P}[n] = \boldsymbol{0}_L$ for all $m \neq n$, provided that $\xi \geq L$.
In summary, when $\xi \geq L$ and the UL subcarriers are uniformly arranged with a constant spacing of $l$ between two adjacent UL subcarriers, the set of PSs given by (31) satisfies (38), i.e., they are the orthogonality-achieving PSs.
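The orthogonality argument above can also be checked numerically. The following Python sketch is a minimal illustration under stated assumptions, not a reproduction of (31) or (39): $\boldsymbol{G}_{\text{UL}}$ is modeled as the UL rows and first $L$ columns of a unitary $M_{\text{sum}}$-point DFT matrix, $M_{\text{sum}}$ is taken to be an exact multiple of the spacing $l$, and the pilots are an assumed phase ramp with step $\xi = \lfloor \bar{M}/D_{\text{sum}} \rfloor$. The function names and default parameter values are hypothetical.

```python
import numpy as np


def pilot_observation_matrices(m_sum=64, spacing=2, num_taps=4, num_ms=4, offset=1):
    """Return the matrices P[d] = X_d G_UL and the constant M_ul / M_sum."""
    if m_sum % spacing:
        raise ValueError("m_sum must be a multiple of the subcarrier spacing")
    m_ul = m_sum // spacing  # number of evenly spaced UL subcarriers
    xi = m_ul // num_ms      # assumed phase-ramp step of the pilots
    if xi < num_taps:
        raise ValueError("need xi >= L, i.e. M_ul >= L * D_sum")
    ul_idx = offset + spacing * np.arange(m_ul)
    # Assumed model of G_UL: UL rows and first L columns of the unitary DFT.
    g_ul = np.exp(-2j * np.pi * np.outer(ul_idx, np.arange(num_taps)) / m_sum)
    g_ul = g_ul / np.sqrt(m_sum)
    k = np.arange(m_ul)
    p = [np.exp(-2j * np.pi * k * d * xi / m_ul)[:, None] * g_ul
         for d in range(num_ms)]
    return p, m_ul / m_sum


def ls_estimate(p_i, eps, y):
    """LS estimate of the TDCIR of one MS under orthogonality-achieving pilots."""
    return p_i.conj().T @ y / eps
```

With the default values, `p[m].conj().T @ p[n]` equals `eps` times the identity matrix for `m == n` and the all-zero matrix otherwise, so that `ls_estimate` recovers each TDCIR exactly from a noise-free superposition of all MSs' pilot signals.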