Beam Training Technique for Millimeter-Wave Cellular Systems Using Retrodirective Arrays

Beam training in millimeter-wave (mmWave) cellular systems requires a long processing time that is proportional to the product of the number of transmitting and receiving beams. In this paper, we propose a beam training technique that can significantly reduce the beam training time in mmWave cellular systems, using a retrodirective directional array (RDA). In the proposed technique, the beam sweeping operations required at the base station (BS) and mobile station (MS) are significantly reduced owing to the use of the RDA, which automatically returns a signal in the direction along which it originated. A preamble sequence design technique for beam training is proposed to identify the BS, MS, and beams simultaneously transmitted from the BS/MS, using the Zadoff–Chu sequence. The ambiguity condition and detection algorithms are derived so that we can uniquely identify the parameters for beam alignment in asynchronous environments with symbol timing offset (STO) and carrier frequency offset (CFO). Simulations show that the proposed algorithm can correctly detect the parameters for beam alignment in mmWave cellular systems with RDA in asynchronous environments. Moreover, the proposed technique can significantly reduce the period required for beam training, compared with the conventional techniques.


I. INTRODUCTION
The widespread use of smartphones for various applications and services has caused a rapid increase in mobile data traffic. Millimeter-wave (mmWave) communication is a promising technology that allows the use of a wide bandwidth to support the high data rates required for next-generation outdoor cellular systems [1], [2]. However, mmWave signals experience severe path loss compared with signals in lower frequency bands. To compensate for this large path loss and to extend the transmission range in the mmWave frequency band, highly directional beamforming antennas are necessary at both the base station (BS) and mobile station (MS) [3]-[5]. A decrease in wavelength enables a large number of antenna elements to be packed into small form factors, making it feasible to realize the large arrays required for achieving high beamforming gains.
The associate editor coordinating the review of this manuscript and approving it for publication was Haipeng Yao.
In mmWave communication systems, a misalignment between the transmitting (Tx) and receiving (Rx) beams may cause a significant loss of received power, particularly in systems with narrow beams. Therefore, beam training is necessary in mmWave communication systems to find the best beam pair among all possible beam pairs, to maximize the beamforming efficiency. Beam training is necessary in the initialization stage as well as in situations where behavioral changes of MS (rotation, displacement) or environmental changes (blockage) occur. The beam training needs to be performed much more frequently than in conventional communication systems because mmWave links are vulnerable to blocking and falling out of beam alignment. For this reason, various techniques for beam training have been proposed for mmWave communication systems [4]. The beam training techniques can be largely divided into two different categories: exhaustive search technique and iterative search technique. In the exhaustive search technique, single Tx beams are individually transmitted from the BS until all of the Tx beams are transmitted. The Rx beam sweep is performed at the MS for each Tx beam to measure the signal-to-noise ratios (SNRs) for every Tx-Rx beam pair. The measurement of SNRs for all possible Tx-Rx beam pairs must be performed for all neighbor BSs to select a serving BS with the best beam pair, as in the 5G new radio (NR) [2]- [6]. The processing time required for beam training in the exhaustive search technique increases proportionally to the product of the number of Tx beams and number of Rx beams. This long processing time will create significant overhead for a moving MS because beam training should be performed periodically for a possible handover or beam tracking. In the iterative search technique, a hierarchical multi-resolution codebook is used to construct training beamforming vectors with different beamwidths [4], [5]. 
In this technique, the beamwidths are reduced iteratively as the beamforming vectors of the next level have higher resolutions, thereby reducing the beam training overhead. Recently, beam training techniques for mmWave communication systems with hybrid beamforming architectures were proposed using subarray structures [7]-[9].
In this paper, we propose a new beam training technique that can significantly reduce the beam training time in mmWave cellular systems, using a retrodirective directional array (RDA). The RDA is of growing interest due to its unique functionality and relative simplicity in comparison to the approaches used in phased arrays and smart antennas. The RDA has the characteristic of reflecting an incident wave toward the source direction without any prior information on the source location. The analog self-phasing function in these arrays makes them good candidates for possible wireless communication scenarios where high link gain and high-speed target tracking is desired. Conventional phased-array antennas are able to steer their beams by exciting elements with phase shifters. In contrast, RDAs steer their beams automatically without any computationally intensive algorithms or hardware-based phase shifters in response to an incident signal. More traditional beamformers using relatively complex algorithms to determine antenna patterns are slower and more expensive to implement than the RDA. Compared to smart antennas that rely on digital signal processing for beam control, the RDA is much simpler and potentially faster because digital computations are not needed. As the RDA has a much faster target acquisition rate, when compared with that of the conventional radar, it can track a fast-moving target effectively [10]. In addition to its ability of retransmitting the received signal without prior information on the incoming signal, the RDA may have the characteristics of high link gain and low cost, which make it useful in many applications including radio-frequency identification (RFID) [11]; satellite communication [12]; power transmission [13], [14]; and secure communication [15], [16]. To perform phase conjugation in RDA, several techniques such as the Van Atta array and Pon array have been considered. 
These methods use either transmission lines between antenna elements or mixers to configure heterodyne structures. Furthermore, recent techniques use the digital domain or phase-locked loops (PLLs) to resolve the issues of low phase accuracy, high conversion loss, and difficulty in full-duplex communication [17].
However, to the best of the authors' knowledge, there has been no attempt to use the RDA for reducing the beam training overhead in mmWave cellular systems. We expect the beam training overhead to be reduced significantly if the RDA is applied to the MS in mmWave cellular systems. As the RDA can automatically return a signal in the direction of its source, it can be used to detect the existence of MSs and find optimal beam pairs. To use the RDA in the beam training period, it should be able to provide its identity in the retransmitted signal, because the receiver must identify the source of the signal from the received signal. In cellular systems, the BS will receive the signal reflected from the MS with RDA as well as the signals transmitted from adjacent BSs and MSs. In this paper, we propose a beam training technique to find the optimal beam pair in the presence of multiple BSs and MSs.
As beam training is performed in the initialization stage (or when behavioral/environmental changes occur), the optimal beam ID (BID) should be detected in the presence of symbol timing offset (STO) and carrier frequency offset (CFO) [18], [19]. In this paper, a training signal design technique based on Zadoff-Chu (ZC) sequence is investigated to perform beam training effectively for a mmWave cellular system with RDA in the presence of STO or CFO. The ZC sequence is widely used for the design of the synchronization signal, reference signal, and random-access preamble in LTE or 5G NR systems, because of its good correlation property and low PAPR. We also derive the conditions for the parameters (root index, cyclic shift/phase rotation) of the ZC sequence, which can avoid ambiguity problems during BID detection for a mmWave cellular system with RDA, in the presence of STO and CFO.
The remainder of this paper is organized as follows. In Section II, various retrodirective architectures for microwave and mmWave frequency bands are summarized. The concept of the proposed beam training technique is introduced for mmWave cellular systems using RDA. The system model for orthogonal frequency-division multiplexing (OFDM)-based mmWave cellular systems with RDA is also described. In Section III, a preamble sequence design technique is proposed for beam training in mmWave cellular systems with RDA, using the ZC sequence. In Section IV, the performance of the proposed beam training technique is evaluated through simulations, using a simple channel model. Finally, the conclusions are presented in Section V.

A. RETRODIRECTIVE ARRAY (RDA)
The RDA has the distinct advantage of being able to automatically return a signal in the direction along which it originated. Various retrodirective architectures have been proposed for VOLUME 8, 2020 microwave and mmWave frequency bands. The Van Atta array uses antenna elements placed symmetrically around a geometrical center to form a linear or two-dimensional array [20]. As each antenna element is interconnected with a transmission line of the same electrical length, the Van Atta array shows reliable and efficient performance with a relatively simple structure. However, the arrangements of the different elements are limited to symmetrical, equidistant, and planar configurations. Another architecture, the Pon array, implements phase conjugation by using a heterodyne technique based on mixers [21]. In this method, the incident signal is down-converted to an intermediate frequency (IF) by a mixer with a local oscillator (LO) frequency twice higher than the input frequency. Then, the in-phase component of the output IF signal is removed to maintain only the reversed phase component for phase conjugation. The Pon array has more design flexibility in terms of array elements because the distance between antenna elements in the Pon array is not as critical as that in the Van Atta array. As the Pon array has more freedom regarding the placement of elements, it can also be applied in spherical and cylindrical arrays. Further, if an active mixer is adopted, additional gain can be provided in the RF chain. More importantly, full-duplex communication is possible, owing to the different transmission and reception frequencies. However, the nonlinear characteristics of mixers requiring LO frequency doubled to RF might degrade the overall performance and increase the design complexity [22]. To retain stable frequency and phase in a heterodyne RDA architecture, a phase-locked loop (PLL) can be used, as shown in Fig. 1 [23]. 
The phase conjugation based on the use of both an IQ modulator and a tracking PLL can provide good cancellation of unwanted mixing products as well as phase tracking of a weak incoming signal. In Fig. 1, an input at frequency $\omega_{RF}$ with phase $\varphi$ is applied to a down-converting mixer driven at $\omega_{LO1}$. After passing through a low-pass filter, a signal (a) is obtained. This signal is fed separately to the IQ modulator, where the in-phase and quadrature signals are given by (b) and (c), respectively. Finally, the output signal (d), obtained by summing (b) and (c), contains the $-\varphi$ term, which is the conjugate of the input signal phase $\varphi$. The frequencies $\omega_{LO1}$ and $\omega_{LO2}$ are chosen to provide the desired frequency $\omega_{IF}$. As the structure is completely analog, RDAs with a PLL can perform high-speed tracking. Further, this architecture has the advantage of using an LO frequency similar to the input frequency [24].
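The heterodyne phase conjugation described above for the Pon array (LO at twice the input frequency) can be checked with a product-to-sum identity. The following is only a sketch for an ideal multiplier with unit amplitudes; it ignores conversion loss and mixer nonlinearity, and the symbols follow the notation used for Fig. 1.

```latex
\begin{align}
\cos(\omega_{RF} t + \varphi)\,\cos(2\omega_{RF} t)
  &= \tfrac{1}{2}\left[\cos\big((\omega_{RF} t + \varphi) - 2\omega_{RF} t\big)
     + \cos\big((\omega_{RF} t + \varphi) + 2\omega_{RF} t\big)\right] \\
  &= \tfrac{1}{2}\cos(\omega_{RF} t - \varphi)
     + \tfrac{1}{2}\cos(3\omega_{RF} t + \varphi).
\end{align}
```

After filtering out the component at $3\omega_{RF}$, the remaining term has the original frequency and the reversed phase $-\varphi$, which is the phase-conjugated signal.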
A digital RDA is also developed, where the received signal is first mixed down to a lower frequency and then applied to an analog-to-digital (A/D) converter [17], [25]. All other functions are completed using digital signal processing techniques and the phase-conjugation principle is used to achieve automatic beam tracking. Thus, the digital receiver technique and phase-conjugation concept are combined in the digital RDA. Further, a delay-locked loop (DLL) phase conjugator is applied to the RDA architectures [26], [27].
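The retrodirective behavior that follows from phase conjugation can be illustrated numerically. The sketch below is not taken from the paper: it assumes an ideal uniform linear array with half-wavelength spacing, a single plane wave, and no noise, and the function name `retrodirective_pattern` is hypothetical.

```python
import numpy as np

def retrodirective_pattern(num_elements, incident_deg, scan_deg, spacing_wavelengths=0.5):
    """Normalized far-field magnitude of an ideal phase-conjugating linear array."""
    n = np.arange(num_elements)
    kd = 2 * np.pi * spacing_wavelengths  # phase step between adjacent elements
    # Phase of the incident plane wave sampled at each element.
    received = np.exp(1j * kd * n * np.sin(np.deg2rad(incident_deg)))
    # Phase conjugation: each element retransmits the conjugate of what it received.
    weights = np.conj(received)
    # Array response toward every scan angle (rows: angles, columns: elements).
    steering = np.exp(1j * kd * np.outer(np.sin(np.deg2rad(scan_deg)), n))
    return np.abs(steering @ weights) / num_elements

scan = np.arange(-90.0, 90.5, 0.5)  # scan grid in degrees
pattern = retrodirective_pattern(num_elements=8, incident_deg=20.0, scan_deg=scan)
peak_angle = scan[np.argmax(pattern)]
print(peak_angle)
```

Under these assumptions the retransmitted beam peaks at the angle of arrival, without the array ever estimating that angle.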
However, the digital RDA has higher power consumption and higher circuit complexity because of the A/D conversion [28]. In particular, the power consumption increases as the sampling rate increases. For example, when the number of antenna elements is 20 and the sampling rate is 20 Msamples/s (MSPS), the power consumptions of the analog and digital RDAs are 2 and 12 W, respectively. At sampling rates of 0.6 and 20 MSPS, the power consumptions of the digital RDA are 0.7 and 12 W, respectively. To apply the RDA to mmWave cellular systems, an RDA with a mixer having low LO frequency, low conversion loss in the mmWave band, and low power consumption must be considered. Further, the RDA should provide its identity in the reflected signal so that the receiver can distinguish between the signals reflected from the target station and those transmitted from adjacent MSs/BSs, to properly detect the reflector (target ID). Therefore, in this study, an analog PLL-based RDA is chosen to satisfy these requirements. Here, the frequencies of the LOs in the analog PLL-based RDA are selected such that the signal retransmitted from the target station has a frequency shift corresponding to its ID.

B. PROPOSED BEAM TRAINING TECHNIQUE
In this subsection, we describe how the RDA can be used in mmWave cellular systems to reduce the beam training time. Conventional beam training techniques can be divided into two categories: the exhaustive search technique and the iterative search technique. Fig. 2 shows the procedure of the exhaustive search technique, in which the processing time required for beam training increases proportionally to the product of the number of Tx beams and the number of Rx beams. In this figure, it is assumed that a single antenna array is available at the BS and MS. If multiple antenna arrays are available at the BS and/or MS, the training time can be reduced by transmitting multiple beams simultaneously. Fig. 3 shows the procedure of the iterative search technique. In the iterative search technique, a hierarchical multi-resolution codebook is used to construct training beamforming vectors with different beamwidths. In this technique, the beamwidths are reduced iteratively as the beamforming vectors of the next level have higher resolutions, thereby reducing the beam training overhead. In Stage 2, narrower beams generated by beamforming vectors of the next level are used for Tx beamforming. The same procedure is used in Stage 2 to find the best narrow beam pair. Although the iterative search is considered only up to Stage 2 in this figure, the process can be repeated using narrower beams.
In the proposed technique, the RDA is applied to the MS. The operational concept of beam training in the proposed technique is shown in Fig. 4. Although only one BS and two MSs are shown in this figure for simplicity, the concept can be applied to multicell multiuser environments. As shown in the figure, the BS with a beamforming antenna array plays the role of the beam sweeping source in Stage 1, whereas the MS with a beamforming antenna array plays the role of the beam sweeping source in Stage 2. Figs. 4(a) and (b) show the procedure in Stage 1. In Stage 1, the BS transmits a training signal sequentially in different directions. In this stage, the RDA at the MS is activated. If an MS is located in one of the beam directions, the RDA at the MS automatically returns the signal to the source (BS). Here, it is assumed that MS1 is located in the beam direction of Tx #1, while MS2 is located in the beam direction of Tx #3. In Fig. 4(a), the RDA at MS1 in the beam direction of Tx #1 automatically returns the signal to the BS. In Fig. 4(b), the RDA at MS2 in the beam direction of Tx #3 automatically returns the signal to the BS. Then, the BS detects the existence of MS1 and MS2 and finds the Tx beam directions corresponding to the MSs. The BS can also detect the identity of the MS (MS ID) from the received signal because the signal reflected from the RDA has a frequency shift corresponding to its ID. As shown in Fig. 1  in Stage 2 of the proposed technique, respectively. $N_{MS}$ is the number of MSs that require beam training. $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$. $T_S$ denotes the symbol duration.

C. SYSTEM MODEL WITH RDA
In OFDM-based cellular systems, the time-domain training signal (preamble) transmitted by the BS is the $N$-point inverse discrete Fourier transform of the frequency-domain training signal, as expressed in (1), where $X^{t,b}(k)$ denotes the training signal transmitted on the $k$-th subcarrier. The preamble sequence $P^{t,b}(v_k)$ is allocated on the set of subcarriers $\{k_{preamble}\}$. That is, $X^{t,b}(k) = P^{t,b}(v_k)$ for $k \in \{k_{preamble}\}$ and $X^{t,b}(k) = 0$ otherwise. Here, $v_k$, $t$, and $b$ denote the index of the sequence element corresponding to the subcarrier index $k$, the Tx ID, and the BID, respectively. The Tx ID $t$ becomes the BS ID in the proposed technique. $N_{seq}$, $N_b$, and $N_T$ denote the sequence length, the number of BIDs (simultaneous beams), and the number of Tx IDs, respectively. $N_b$ becomes $N^{multi}_{BS}$ in the proposed technique. A preamble sequence design for a mmWave cellular system with RDA is described in Section III.
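The subcarrier allocation and time-domain conversion described above can be sketched as follows. This is only an illustration: the contiguous subcarrier set, the placeholder Zadoff-Chu sequence, and the 1/N normalization of `numpy.fft.ifft` are assumptions, and `build_preamble_symbol` is a hypothetical helper name.

```python
import numpy as np

def build_preamble_symbol(sequence, preamble_subcarriers, fft_size):
    """Place a preamble on selected subcarriers and convert it to the time domain."""
    freq = np.zeros(fft_size, dtype=complex)
    freq[preamble_subcarriers] = sequence  # X(k) = P(v_k) on preamble subcarriers, 0 elsewhere
    time = np.fft.ifft(freq)               # numpy applies a 1/N factor here (assumption)
    return freq, time

fft_size, n_seq = 1024, 31
v = np.arange(n_seq)
sequence = np.exp(-1j * np.pi * 13 * v * (v + 1) / n_seq)  # placeholder ZC sequence, root 13
preamble_subcarriers = np.arange(1, n_seq + 1)             # hypothetical allocation
freq, time = build_preamble_symbol(sequence, preamble_subcarriers, fft_size)
print(np.count_nonzero(freq), time.shape)
```

Taking the FFT of the returned time-domain symbol recovers the frequency-domain allocation, which is what the receiver relies on before correlating against the reference preamble.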
The received signal at the input of RDA is expressed as Tx denote the channel coefficient of the l-th path, effective channel matrix, and beamforming vector for Tx ID t and BID b, respectively. φ and θ denote the AoA and AoD, respectively. ε denotes a normalized CFO between the BS and MS.
The received signal at the output of the RDA is given by (3), where $\phi$ and $f_{Rx}$ denote the additional phase shift and the amount of frequency shift introduced by the RDA, respectively. $f_{Rx}$ is used to distinguish the MSs. Note that the output signal of the RDA in (3) becomes a conjugated version of the input signal because of the heterodyne RDA architecture based on mixers. The signal reflected from the RDA and received at the BS can be expressed as (4). Here, $(\eta^{t,b}_l)^*$ and $\eta^{t,b}_m$ denote the Tx and Rx beamforming gains, respectively, at the BS. It is assumed that the signal is received through the same antenna array vector $m^{t,b}_{Tx}$ used for transmission. Note that the cross terms in (4) with path indices $l \neq m$ are normally small compared with the terms with $l = m$, because the paths in the cross terms have different AoDs and AoAs owing to the beam misalignment between the BS and MS. Thus, in this study, we consider only the terms with $l = m$ and ignore the cross terms. In addition, the power difference between the line-of-sight (LoS) path and the non-line-of-sight (NLoS) paths in the $l = m$ terms increases because the signal passes through the same channel twice, i.e., the path gain becomes $|h_m|^2$. For this reason, we consider only the LoS path ($l = m = 0$) in the following discussion. Then, (4) can be simplified as (5). Here, $\tau$ denotes the delay of the LoS path. Because (5) is a conjugated version of the Tx signal, the received signal should be conjugated before demodulation. When CFO exists, the frequency-domain representation of (5) is given by (6), where $H = e^{j2\pi 3\tau f_{Rx}/N}\,|h|^2\,\|a_{t,Rx}\|_2^2\,|\eta^{t,b}|^2$. Here, $H$ and $I_k$ are the channel coefficient of the LoS path and the interchannel interference (ICI) term of the $k$-th subcarrier, respectively. Note that the preamble sequence $X^{t,b}(k)$ is shifted in the frequency domain by $f_{Rx}$ because of the RDA operation.
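Two effects named above can be checked with a small DFT experiment: an integer frequency shift moves the sequence across subcarriers, whereas a fractional offset leaks energy into neighboring subcarriers (ICI). The sketch below is a generic illustration with a random spectrum and an assumed FFT size of 64; it does not model the channel, the conjugation, or the delay in (5) and (6).

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                                # assumed FFT size for the illustration
n = np.arange(N)

# Integer shift: multiplying by exp(j*2*pi*f*n/N) circularly shifts the spectrum by f bins.
X = rng.standard_normal(N) + 1j * rng.standard_normal(N)
x = np.fft.ifft(X)
f_rx = 2
Y = np.fft.fft(x * np.exp(2j * np.pi * f_rx * n / N))
shift_ok = np.allclose(Y, np.roll(X, f_rx))

# Fractional offset: a single tone no longer falls on one bin, so energy leaks out.
tone = np.zeros(N, dtype=complex)
tone[5] = 1.0
eps = 0.2                             # normalized CFO, as in the Fig. 6 example
T = np.fft.fft(np.fft.ifft(tone) * np.exp(2j * np.pi * eps * n / N))
leakage = 1.0 - np.abs(T[5]) ** 2     # fraction of energy outside the original bin

print(shift_ok, leakage)
```

The first check corresponds to the frequency-domain shift by $f_{Rx}$; the second shows why a nonzero CFO lowers the main correlation peak and raises side peaks.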

III. PREAMBLE SEQUENCE DESIGN FOR BEAM TRAINING
In this section, a preamble sequence for beam training in a mmWave cellular system with RDA is designed. The preamble sequence is designed using the ZC sequence because of its good correlation property and low PAPR [29]. The ZC sequence has the characteristic of zero autocorrelation. The normalized cross-correlation between ZC sequences with different root indices is $1/\sqrt{N_{seq}}$. The preamble sequence is generated by applying a phase rotation or a cyclic shift to the ZC sequence. $P^{t,b}_0(v)$ and $P^{t,b}_1(v)$ denote the preamble sequences generated by applying a phase rotation and a cyclic shift, respectively, to the ZC sequence. The preamble sequence $P^{t,b}_q(v)$, which is used for training signal generation in (1), is defined in (7), where $0 < \beta < N_{seq}$ and $0 \le v < N_{seq}$. Here, $Z^t$, $r^{(t)}$, and $v$ denote the ZC sequence for Tx ID $t$, the root index of the ZC sequence, and the index of the sequence element, respectively. $\beta$ denotes a phase rotation constant when $q$ is 0 and the cyclic shift spacing when $q$ is 1. The parameter $\beta$ is introduced in the proposed sequence for the following reason. In the proposed beam training technique using RDA, a large STO may exist in the received signal because the signal reflected from the RDA at the MS is used at the BS to detect the existence of MSs. Note that the proposed beam training technique requires a round-trip transmission (BS-MS-BS), unlike the conventional beam training techniques, which require a one-way transmission (BS-MS). Also, in the proposed beam training technique, the RDA should provide its identity in the reflected signal so that the receiver can distinguish between the signals reflected from the MS and the preambles transmitted from adjacent MSs/BSs. In the proposed technique, the frequencies of the local oscillators in the RDA are used to identify MSs. The signal retransmitted from an MS has a frequency shift corresponding to its ID. The MSs can be identified incorrectly in the presence of CFO. 
Thus, the value of $\beta$ is set such that BIDs can be distinguished in asynchronous environments (CFO, STO). The cross-correlation between the preamble (reference) signal and the received (returned) signal at the receiver with Tx ID $t$ and BID $b_{ref}$ is given by (8), where $0 \le \xi < N_{seq} - 1$. Here, $b_{ref}$ denotes the reference BID. $N$ and $N_{seq}$ are assumed to be the same. When $\varepsilon$ is zero, (8) can be rewritten as (9) in terms of $C^{t,b_{ref}}_q$. Here, % denotes the modulo operator. The exponent terms of A1 and A2 are defined in (10). The pair $\{\xi_D, t_D\}$ that maximizes the cross-correlation is detected, as summarized in Algorithm 1. Here, the subscript D denotes the detected value that maximizes the cross-correlation in (8). The parameters $\{b_D, f_D, \tau_D\}$ for beam alignment are obtained from $\{\xi_D, t_D\}$ using (11). From (4), it can be seen that the output of the RDA is given by a conjugated form of the input. Thus, if the Tx ID is $t$, the root index of the signal returning from the RDA becomes $(N_{seq} - r^{(t)})$. However, $(N_{seq} - r^{(t)})$ can be the root index of another BS. Thus, the conjugation property of the RDA can cause a misdetection in cellular systems because the signal returning from the RDA cannot be distinguished from the signals transmitted from adjacent BSs with root index $(N_{seq} - r^{(t)})$. To solve this problem, the frequency shift $f_{Rx}$ in the RDA is set to nonzero values so that the signal returning from the RDA can be distinguished from the signal transmitted from the adjacent BS or that reflected from obstacles. In the example of Fig. 5, $N_b$, $N_{seq}$, $b_{ref}$, $\beta$, and $r^{(t)}$ are set to 2, 31, 1, 16, and 13, respectively. The values of $\{b, f_{Rx}, \tau\}$ for two RDAs are set to {0, −2, 0.5} and {0, 2, 0.5}, respectively. From (11), it can be seen that $\xi^{t,b_{ref}}_q$ corresponding to {0, −2, 0.5} and {0, 2, 0.5} becomes 11 and 21 when $q$ is 0, and 5 and 15 when $q$ is 1. From Fig. 5, it can be seen that the peak positions are the same for both the analysis and simulation results.
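The ZC properties used in this section (zero cyclic autocorrelation, cross-correlation of $1/\sqrt{N_{seq}}$, and the change of root index to $N_{seq} - r^{(t)}$ under conjugation) can be verified numerically. The sketch below assumes the common odd-length ZC definition $e^{-j\pi r v(v+1)/N_{seq}}$, which may differ from the exact form of the paper's definition; the helper names are hypothetical.

```python
import numpy as np

def zadoff_chu(root, length):
    """Odd-length Zadoff-Chu sequence (assumed definition)."""
    v = np.arange(length)
    return np.exp(-1j * np.pi * root * v * (v + 1) / length)

def cyclic_correlation(a, b):
    """Normalized cyclic correlation of a with b for every lag."""
    n = len(a)
    # np.vdot conjugates its first argument; np.roll(b, -lag)[v] equals b[(v + lag) % n].
    return np.array([np.vdot(np.roll(b, -lag), a) / n for lag in range(n)])

n_seq = 31
z13 = zadoff_chu(13, n_seq)
z27 = zadoff_chu(27, n_seq)

auto = np.abs(cyclic_correlation(z13, z13))    # 1 at lag 0, 0 elsewhere
cross = np.abs(cyclic_correlation(z13, z27))   # constant 1/sqrt(n_seq)
conj_matches = np.allclose(np.conj(z13), zadoff_chu(n_seq - 13, n_seq))

print(auto[0], auto[1:].max(), cross.min(), cross.max(), conj_matches)
```

The last check illustrates why a conjugated return with root 13 looks like a transmission with root 18 when $N_{seq} = 31$, which is the collision that the nonzero frequency shift $f_{Rx}$ is meant to resolve.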
Ambiguity occurs when the condition on the root index in (12) or the condition on $\beta$ in (13) is satisfied. Here, $g_0$ and $g_1$ denote arbitrary constants introduced to replace the modulo operations in (11). The terms $g_0$ and $g_1$ are associated with $\{b_0, f_{Rx,0}, \tau_0\}$ and $\{b_1, f_{Rx,1}, \tau_1\}$, respectively. The values of $\Delta f$, $\Delta b$, and $\Delta\tau$ range from $\Delta f_{min}$ to $\Delta f_{max}$, $(-N_b+1)$ to $(N_b-1)$, and $-\tau_{max}$ to $\tau_{max}$, respectively. Here, $\Delta f_{min}$, $\Delta f_{max}$, and $\tau_{max}$ denote the minimum value of $\Delta f$, the maximum value of $\Delta f$, and the maximum transmission delay. The range of $(g_1 - g_0)$ in (12) is expressed using $f_{Rx,min}$, $f_{Rx,max}$, and $\tau_{min}$, which denote the minimum value of $f_{Rx}$, the maximum value of $f_{Rx}$, and the minimum transmission delay. $\lfloor x \rfloor$ denotes the largest integer not exceeding $x$. The amount of frequency shift $f_{Rx}$ is given by an integer value ranging from $-f_{Rx,max}$ to $f_{Rx,max}$, excluding zero. $r^{(t)}_{max}$ (the largest value of $r^{(t)}$) is $N_{seq} - 1$. The root index set for the ZC sequence, constructed by eliminating the root indices satisfying the ambiguity condition in (12), is denoted with subscript 0. Ambiguity also occurs when the denominator in (12) becomes zero. When $q$ is 0, the denominator $\Delta f$ in (12) becomes zero if the terms $f_{Rx} r^{(t)}$ in (11) for different Rx beams are the same. In this case, the peak locations of the beams with different $b$ and $\tau$ will be the same. Similarly, when $q$ is 1, the denominator $\Delta f - \Delta b\,\beta$ in (12) becomes zero if the corresponding terms in (11) for different Rx beams are the same (with different values of $f_{Rx}$ and $b$). This ambiguity can be avoided by selecting the value of $\beta$ such that the denominator is nonzero. The ambiguity condition on $\beta$ is summarized in (13). Thus, the ambiguity problem can be avoided by eliminating the values of $\beta$ satisfying the ambiguity condition in (13). When CFO exists, ICI occurs in OFDM systems, as given in (6). The ICI has the effect of decreasing the main peak and increasing the side peaks in the cross-correlation plot. Fig. 6 shows the effect of ICI when CFO ($\varepsilon = 0.2$) is added to the setting of Fig. 5 (no CFO). As can be seen in Fig. 6 (simulation), the main peaks decrease and side peaks are generated at other positions, resulting in misdetection of the parameters in Algorithm 1. To reduce the ICI effect, a frequency margin (mar) is considered when the parameter detection is performed using the peak position at the receiver side. Note that the index $\xi^{t,b_{ref}}_q$ corresponding to the peak value of the correlation is given as a function of the frequency shift $f_{Rx}$, as shown in (11). As the received frequency corresponding to the frequency shift $f_{Rx}$ changes with the value of the CFO, the search space for the frequency shift is changed to account for the frequency margin. For example, when two different values of $f_{Rx}$ are considered with mar = 1, the candidate values of $f_{Rx}$ can be [−2, 2] because the frequency shift in the RDA should be set to nonzero values. In this case, $f_{mar}$ ($f_{Rx} \pm$ mar) becomes [−3, −1, 1, 3]. As mar increases, the effect of ICI decreases. However, the number of available root indices is reduced as mar increases.
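The construction of the margin frequencies in the example above can be written as a one-line set operation. The helper name `margin_frequencies` is hypothetical, and the sketch covers only the $f_{Rx} \pm$ mar step, not the root-index elimination.

```python
def margin_frequencies(f_rx_values, margin):
    """Return the sorted set of frequencies f_rx - margin and f_rx + margin."""
    candidates = set()
    for f_rx in f_rx_values:
        candidates.update((f_rx - margin, f_rx + margin))
    return sorted(candidates)

f_mar = margin_frequencies([-2, 2], margin=1)
print(f_mar)  # [-3, -1, 1, 3]
```

With a spacing of 4 between the two frequency shifts and a margin of 1, the margin frequencies of the two MSs do not overlap each other or the unused shift of zero.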
Algorithm 2 is developed to detect the parameters for beam alignment when CFO exists. Steps 3 to 6 of Algorithm 2 read as follows:
3. If $\xi_D \in \{\xi_{mar}\}$, where $\xi_{mar}$ is the peak location corresponding to $f_{mar}$
4. $f_{Rx,cand_0} = f_D - \mathrm{mar}$ and $f_{Rx,cand_1} = f_D + \mathrm{mar}$
5. Calculate $\{\xi_{C_n}, t_{C_n}\}$ (except $f_{mar}$ and zero) between $f_{Rx,cand_0}$ and $f_{Rx,cand_1}$ using (11)
6. Estimate $\{\xi_D, t_D\}$ as the candidate in $\{\xi_{C_n}, t_{C_n}\}$ that maximizes the cross-correlation
In Step 1 and Step 2, the peak position ($\xi_D$) that maximizes the cross-correlation between the reference signal and the returned signal is determined using (8). If the detected peak position $\xi_D$ is one of the positions $\xi_{mar}$ corresponding to $f_{mar}$, the following steps (Step 3 to Step 7) are performed because there is a high possibility that the peak position has shifted owing to the CFO. The values of $\xi_{mar}$ corresponding to $f_{mar}$ can be obtained from (11). If $\xi_D$ is different from $\xi_{mar}$, Step 8 is executed because the effect of ICI is not large enough to shift the peak position. In Step 4, the candidate frequencies $f_{Rx,cand_0}$ and $f_{Rx,cand_1}$ are obtained by adding −mar and +mar, respectively, to $f_D$. In Step 5, all possible peak positions $\xi_{C_n}$ corresponding to the candidate frequencies between $f_{Rx,cand_0}$ and $f_{Rx,cand_1}$ are calculated using (11). Here, $n$ is set to 0 for $f_{Rx,cand_0}$ and 1 for $f_{Rx,cand_1}$. The peak positions corresponding to $f_{mar}$ and zero are not calculated because $f_{mar}$ and zero are not in the list of frequency shifts in the RDA. In Step 6, among all possible peak positions $\xi_{C_n}$, the peak position that maximizes the cross-correlation is obtained and set to $\xi_D$. The other parameter $t_D$ is also obtained in this step. In Step 8, the parameters $\{b_D, f_D, \tau_D\}$ corresponding to $\xi_D$ are obtained using (11). As the frequency shift $f_{Rx}$ is changed to $f_{mar}$ in Algorithm 2 to account for the frequency margin mar, the root index set for the ZC sequence should be changed accordingly. 
A root index set for Algorithm 2, denoted with subscript 1, is constructed by eliminating the root indices that produce ambiguity in parameter detection owing to $f_{mar}$ from the root index set with subscript 0 (the root index set for Algorithm 1). The ambiguity condition in (12) is used to find the root indices corresponding to $f_{mar}$.

IV. SIMULATION
In this section, the performance of the proposed beam training technique for mmWave cellular systems with RDA is evaluated using simulations. The BS and MS are assumed to be equipped with a uniform planar array (UPA) with 16 × 16 elements ($N^b_{BS} = 256$) and a uniform linear array (ULA) with 8 elements ($N^b_{MS} = 8$), respectively. It is also assumed that the RDA used in the proposed technique is an analog PLL-based RDA with a retransmission power of 0 dBm [24]. The carrier frequency, subcarrier spacing, length of the cyclic prefix, size of the fast Fourier transform (FFT), and length of the ZC sequence ($N_{seq}$) are set to 28 GHz, 120 kHz, 0.57 µs, 1024, and 31, respectively. The round-trip delay ($2\tau$) between the BS and MS ranges from 0 to 33 samples. The azimuth angle and elevation angle for the target MS are set to 20° and 90°, respectively. The channel between the BS and MS is assumed to experience Rician fading, consisting of an LoS path and an NLoS path. The NLoS path is generated using a spatial channel model (SCM) and is composed of 20 rays with a 2° azimuth spread. The K-factor is set to 15 dB. The frequency shift $f_{Rx} = 0$ is not used so that the signal returning from the RDA can be distinguished from the signal transmitted from the adjacent BSs. In the proposed technique, the parameters $\{b, f_{Rx}, \tau\}$ corresponding to the correlation peak are detected using Algorithm 1. From Fig. 7, it can be seen that the performance of the proposed technique with $r^{(t)}$ of 13 approaches 100% at an SNR of −34 dB and is similar to that of the conventional technique. However, the performance degrades significantly when $r^{(t)}$ is 27. This result can be analyzed using the ambiguity condition in (12). An example of (12) is shown in Table 2. Without ambiguity, the peak position is uniquely mapped to $\{b, f_{Rx}, \tau\}$. In Table 2, the transmission delay is expressed in terms of both the number of symbols and the delay time (symbols/msec). Here, the symbol duration ($T_S$) is set to 8.33 µs.
As $f_{Rx}$ is set to [−2, 2], $f_{mar}$ will be [−3, −1, 1, 3]. The root index of the ZC sequence is set to 13 (no ambiguity; included in the root index set with subscript 1). The value of $\varepsilon$ is varied from 0 to 0.4. From this figure, it can be seen that the performance of Algorithm 1 degrades as $\varepsilon$ increases, because the effect of CFO is not considered in Algorithm 1. However, the performance degradation is relatively small when Algorithm 2 is used, which shows that Algorithm 2 is robust in parameter detection when CFO exists. Fig. 9 compares the number of symbols required for beam training using the formula in Table 1, with the remaining parameters set to 4, 24, 1, and 1, respectively. As can be seen in this figure, the number of symbols required for the proposed technique is significantly smaller than those required by the conventional techniques. For example, when $N^b_{BS} = 256$ (UPA with 16 × 16 elements), the numbers of symbols required for the exhaustive search, iterative search, and proposed technique are 2048, 1568, and 288, respectively. The difference increases as the value of $N^b_{BS}$ increases.
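The figures quoted in this section can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; it does not reproduce the formulas of Table 1, and treating the exhaustive-search count as the plain product of the BS and MS beam numbers is an assumption that happens to match the quoted 2048 symbols.

```python
# Useful OFDM symbol duration implied by the subcarrier spacing (cyclic prefix excluded).
subcarrier_spacing_hz = 120e3
symbol_duration_us = 1e6 / subcarrier_spacing_hz

# Beam counts from the simulation setup.
n_bs_beams, n_ms_beams = 256, 8
exhaustive_symbols = n_bs_beams * n_ms_beams   # product of Tx and Rx beam numbers (assumption)

# Symbol counts quoted in the text for the iterative search and the proposed technique.
iterative_symbols, proposed_symbols = 1568, 288
proposed_ratio = proposed_symbols / exhaustive_symbols
iterative_ratio = iterative_symbols / exhaustive_symbols

print(round(symbol_duration_us, 2), exhaustive_symbols)
print(round(100 * proposed_ratio, 1), round(100 * iterative_ratio, 1))
```

The symbol duration agrees with the stated 8.33 µs, and the ratio of 288 to 2048 symbols is about 14%, consistent with the saving reported for the 16 × 16 UPA.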

V. CONCLUSION
In this paper, a beam training technique that can significantly reduce the beam training time in mmWave cellular systems using RDA is proposed. In the conventional techniques, the processing time for beam training increases in proportion to the product of the number of Tx beams and the number of Rx beams, because Tx/Rx beam sweeping is performed for each Rx/Tx beam. However, in the proposed technique, the processing time is proportional to the sum of the number of Tx beams and the number of Rx beams, because only one Tx beam sweep and one Rx beam sweep are required, owing to the use of RDA. A preamble sequence design technique for beam training is proposed to identify the BS, MS, and beams simultaneously transmitted from the BS/MS, using the ZC sequence. It is shown that ambiguity may occur in detecting the parameters for beam alignment because of the inherent nature of the ZC sequence. The ambiguity condition and detection algorithms are derived so that the parameters for beam alignment can be uniquely identified in the presence of STO and CFO. It is shown through simulations that Algorithm 2 can correctly detect the parameters for beam alignment in mmWave cellular systems with RDA when CFO exists. The performance of the proposed technique is shown to be similar to that of the conventional technique (exhaustive search). However, the proposed technique with a 16 × 16 UPA requires only 14% of the training symbols required by the conventional technique (exhaustive search).