Deep-Learning-Based Physical-Layer Lightweight Authentication in Frequency-Division Duplex Channel

This letter proposes a lightweight authentication scheme based on secret key generation for frequency-division duplexing. First, a base station (BS) predicts downlink channel state information (CSI) from uplink CSI with the aid of deep learning. Then, a secret key is shared between the BS and a mobile user by quantizing the downlink CSI. Since this key generation method uses physical-layer features, the costs of computational complexity, key distribution, and key management, which are typically imposed by conventional upper-layer key generation, are significantly reduced. Furthermore, the generated key is used to carry out low-latency and low-complexity authentication, which is suitable for Internet of things applications.


I. INTRODUCTION
THE evolution of communication toward the beyond 5G (B5G) standard [1] is expected to support a large number of diverse Internet of things (IoT) devices and is also necessary for boosting fundamental wireless performance. Furthermore, maintaining IoT security [2] against denial of service attacks, resource consumption, masquerade attacks, replay attacks, information disclosure, and message modification is also vital in B5G [3], [4], [5], [6]. In order to encrypt information, a secret key is commonly shared between two legitimate users by public-key algorithms, such as RSA [7] or those based on the elliptic curve discrete logarithm problem (DLP) [8]. However, public-key algorithms cannot achieve quantum resistance and typically impose high encoding/decoding complexity, especially when employed for encryption with a long public key. Hence, key exchange by a public-key algorithm may not be suitable for future IoT devices in terms of security and energy efficiency.
By contrast, as a part of physical-layer security, the concept of secret key generation (SKG) has been extensively investigated [9], where the costs of key sharing/managing, the overhead, the latency, and the encoding/decoding complexity are reduced [10]. More specifically, a secret key is generated by the quantization of channel state information (CSI) shared between two legitimate users [11]. In most previous SKG studies [12], the use of a reciprocal time-division duplex (TDD) channel is assumed to allow two legitimate users to share the channel information and acquire the same secret key from quantized CSI. To relax the reciprocal channel constraint, SKG in a non-reciprocal frequency-division duplex (FDD) channel was developed [13], [14], where the correlation between the uplink and downlink channels is exploited. As an additional benefit, SKG in FDD allows a reduction of latency in comparison to its TDD counterpart.
The use of secret keys for authentication has been considered with the aid of a public key infrastructure (PKI) [15], which uses a pair of a secret key and a public key in the public-key algorithm, whose security performance depends on the difficulty of mathematical problems, such as factoring problems, discrete logarithm problems, or the elliptic curve DLP. To avoid the complexity inherent to a PKI, an authentication method using a physical-layer feature was proposed with the aid of the confirmation of characteristic distortion [16], where a secret key in TDD is used to assign time slots to each user. In this scheme, the complexity and the latency are significantly low. However, to the best of our knowledge, physical-layer authentication in a non-reciprocal FDD channel, where the uplink and downlink channels are not the same, has not been developed, nor has a detailed analysis of such a method been performed.
Against this background, the novel contributions of this letter are as follows. We propose a lightweight authentication scheme based on SKG in a non-reciprocal FDD channel. More specifically, the downlink channel is estimated from the uplink pilot symbols with the aid of deep learning (DL) at a base station (BS), while a user directly estimates the downlink channel from the pilot symbols transmitted from the BS. This allows us to reduce the feedback information typically needed for conventional SKG in the FDD channel, hence reducing the risk of information leakage to an eavesdropper, as well as reducing the delay. Furthermore, a secret key is generated and shared between the BS and each user by quantizing the associated downlink channel coefficients. In the presence of fading, an attacker cannot access the associated downlink channel, and hence the generated key is quantum resistant while benefiting from significant reductions in latency, computational cost, and power consumption in comparison to conventional public-key cryptography [17], [18]. Moreover, lightweight authentication is invoked for IoT communication in the scenario of low-latency FDD grant-free access. The BS authenticates each legitimate user when a user transmits data in the time slots allocated by the physical-layer secret key.
Fig. 1 illustrates the multipath channel model considered in this letter, in which each path is the same in uplink and downlink, similar to [13]. The BS is equipped with M antenna elements in the form of a uniform linear array (ULA), and each user is equipped with a single antenna element. In this letter, only a single-user scenario is considered for the sake of simplicity, but this model is readily applicable to a multi-antenna multi-user scenario.

A. FDD Non-Reciprocity Channel Model
The channel is assumed to consist of P paths, and the direction of each path from the BS to the pth scatterer is uniformly distributed over an interval of width ∆θ, where ∆θ is the angular spread (AS). The channel vector at the carrier frequency f between the BS and the user is represented by (1), where α_p, ϕ_p, τ_p, and θ_p are the attenuation, phase shift, delay, and direction of arrival (DOA) of the pth path, respectively. Moreover, a(θ_p) is the array manifold vector, which depends on χ = 2πdf/c, where d is the antenna spacing of the ULA at the BS and c is the speed of light. Note that α_p is a function of the length l_p of the pth path. The phase ϕ_p depends on the scatterer materials and the angles of the incident wave. The delay τ_p is calculated based on the distance traveled by the signal along the pth path. In this letter, we consider the carrier frequencies f_U and f_D for uplink and downlink in an FDD channel, respectively. Hence, it can be seen from (1) that the uplink and downlink channel vectors h(f_U) and h(f_D) are non-reciprocal. This implies that simple quantization of each channel vector, which is typically considered in a TDD scenario, does not generate identical secret keys in an FDD channel.
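The non-reciprocity described above can be illustrated with a minimal sketch. It assumes the common multipath form in which each path contributes α_p·exp(j(ϕ_p − 2πfτ_p))·a(θ_p) and the ULA manifold has entries exp(−jχm·sin θ); these expressions, the helper names, the 1/l_p attenuation, and all numerical values are illustrative assumptions rather than the letter's exact equations or simulation settings.

```python
import cmath
import math
import random

C_LIGHT = 3.0e8  # speed of light [m/s]

def steering_vector(theta, f, m_antennas, d):
    """ULA array manifold; the exp(-j*chi*m*sin(theta)) form is an assumption."""
    chi = 2.0 * math.pi * d * f / C_LIGHT
    return [cmath.exp(-1j * chi * m * math.sin(theta)) for m in range(m_antennas)]

def channel_vector(f, paths, m_antennas, d):
    """Sum the contributions of all paths; each path is (alpha, phi, tau, theta)."""
    h = [0j] * m_antennas
    for alpha, phi, tau, theta in paths:
        gain = alpha * cmath.exp(1j * (phi - 2.0 * math.pi * f * tau))
        a = steering_vector(theta, f, m_antennas, d)
        h = [h_m + gain * a_m for h_m, a_m in zip(h, a)]
    return h

def random_paths(num_paths, theta_min, theta_max, rng):
    """Hypothetical path generator with angles uniform in [theta_min, theta_max]."""
    paths = []
    for _ in range(num_paths):
        length = rng.uniform(200.0, 300.0)      # illustrative path length [m]
        alpha = 1.0 / length                    # illustrative distance-dependent attenuation
        phi = rng.uniform(0.0, 2.0 * math.pi)   # phase shift at the scatterer
        tau = length / C_LIGHT                  # propagation delay
        theta = rng.uniform(theta_min, theta_max)
        paths.append((alpha, phi, tau, theta))
    return paths

rng = random.Random(0)
paths = random_paths(4, math.radians(10), math.radians(30), rng)
f_u, f_d = 2.50e9, 2.62e9          # illustrative uplink/downlink carriers [Hz]
d = 0.5 * C_LIGHT / f_u            # half-wavelength spacing at f_u
h_u = channel_vector(f_u, paths, 32, d)   # same physical paths ...
h_d = channel_vector(f_d, paths, 32, d)   # ... but a different carrier frequency
```

Because the carrier frequency enters both the per-path phase and the array manifold, the same set of paths yields different vectors at f_U and f_D, which is why quantizing each side's own channel does not produce matching keys.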

B. Deep-Learning-Based Downlink Channel Estimation at BS
In this section, we introduce DL-aided estimation of the downlink channel from the uplink channel for the FDD system at the BS [12], [13]. More specifically, our neural network model to be trained, the received pilot signals, the loss function used for our DL, and the optimization algorithm are given. The channel vector ĥ(f_D) estimated by our neural network is obtained by passing h̃(f_U) through the layers f^(1), ..., f^(L−1) in sequence, where L is the number of layers of the artificial neural network (ANN), and h̃(f_U) represents the uplink channel vector estimated by a traditional (non-DL) algorithm at the BS. Moreover, f^(l) is the nonlinear transformation in the lth layer, which applies an activation function g, such as the rectified linear unit (ReLU) function [19], to an affine mapping with the parameters W^(l) ∈ ℂ^(M×M) and b^(l) ∈ ℂ^M of the neural network, which are trained according to our algorithm presented below. Furthermore, the activation function g acts on the real and imaginary parts of its input, where ℜ[·] and ℑ[·] denote the real and imaginary parts of a vector, respectively.
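The layered mapping from the uplink estimate to the downlink estimate can be sketched as follows. The sketch assumes that each layer computes an affine map followed by a ReLU applied separately to the real and imaginary parts, and that the last layer is left without an activation; both choices, and the helper names, are assumptions made for illustration and may differ from the letter's exact definitions.

```python
def split_relu(z):
    """Apply ReLU separately to the real and imaginary parts of z (assumed form of g)."""
    return complex(max(z.real, 0.0), max(z.imag, 0.0))

def affine(weights, bias, x):
    """Compute W x + b, with the complex matrix W given as a list of rows."""
    return [sum(w * x_k for w, x_k in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def forward(params, h_uplink):
    """Pass an uplink channel estimate through all layers.

    params is a list of (W, b) pairs, one per layer. Assumption: the final
    layer is purely affine, so the output is not confined to one quadrant.
    """
    x = list(h_uplink)
    for index, (weights, bias) in enumerate(params):
        x = affine(weights, bias, x)
        if index < len(params) - 1:
            x = [split_relu(z) for z in x]
    return x

# Tiny two-layer example with identity weights and zero biases.
identity = [[1, 0], [0, 1]]
params = [(identity, [0, 0]), (identity, [0, 0])]
h_hat = forward(params, [1 - 1j, -2 + 3j])
```

In this toy case the first layer clips the negative real and imaginary parts, so the input [1 − 1j, −2 + 3j] is mapped to [1, 3j]; a trained network would instead use the parameters learned during the training stage.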
In the proposed downlink CSI estimation, we have two stages, namely, the training stage and the deployment stage. In the training stage, during the channel coherence time, the BS transmits a pilot symbol from each antenna element to the user while the user also transmits a pilot symbol to the BS, which is repeated T times. The associated received signals at the BS and the user are modeled, respectively, by (7) and (8), where x is a pilot symbol, while n_BS^(t) and n_u^(t) are both additive white Gaussian noise obeying the complex-valued Gaussian distribution CN(0, σ²). Also, σ² is the noise variance, and the transmit power of a pilot symbol is represented by P_tx = E[|x|²], where E[·] represents the expectation operation.
The downlink channel vector h̃(f_D) estimated at the user, based on a traditional algorithm, such as zero-forcing (ZF) or minimum mean-square error (MMSE), is fed back to the BS. Then, the neural network is trained based on h̃(f_D) and h̃(f_U), which are the response variable and the explanatory variable, respectively. More specifically, the neural network is trained to minimize a loss function Loss(Ω) that measures the error between h̃(f_D) and ĥ(f_D), where ĥ(f_D) is the downlink channel vector estimated by DL at the BS and ∥·∥_2 denotes the l_2 norm. In our scheme, the parameters of the neural network, Ω = {W^(l), b^(l)}_{l=1}^{L−1}, are optimized by minimizing Loss(Ω) with the aid of the adaptive moment estimation (ADAM) algorithm [20]. In the deployment stage, the BS and the user send pilot symbols to each other, and the BS estimates the uplink channel vector h(f_U) based on the MMSE algorithm as h̃(f_U). Then, h̃(f_U) is input into the trained network to output ĥ(f_D), where the parameters optimized in the training stage are used.
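One plausible form of the training objective is sketched below. It writes the conventionally estimated vectors with a tilde and the DL output with a hat, and it assumes a squared l_2 error averaged over the T training snapshots; the 1/T normalization and the per-snapshot index t are assumptions, not necessarily the letter's exact expression.

```latex
\mathrm{Loss}(\Omega)
  = \frac{1}{T} \sum_{t=1}^{T}
    \left\| \hat{\mathbf{h}}^{(t)}(f_D) - \tilde{\mathbf{h}}^{(t)}(f_D) \right\|_2^2,
\qquad
\hat{\mathbf{h}}^{(t)}(f_D)
  = f^{(L-1)}\!\left( \cdots f^{(1)}\!\left( \tilde{\mathbf{h}}^{(t)}(f_U) \right) \right),
\qquad
\Omega = \left\{ \mathbf{W}^{(l)}, \mathbf{b}^{(l)} \right\}_{l=1}^{L-1}
```

Under this form, the fed-back downlink estimate acts as the regression target and the uplink estimate as the network input, matching the response and explanatory variables named in the text.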

C. SKG From Estimated Downlink Channel Vector
While the BS estimates h(f_D) according to the DL-based scheme of Section II-B, the user estimates h(f_D) from the pilot symbols transmitted from the BS. Then, a secret key is generated by quantizing the estimated downlink channel vector at the BS and the user. In this letter, only the phase information of the estimated downlink channel vector is used for the sake of simplicity. To obtain n secret-key bits per channel coefficient, 2^n-level phase demodulation is carried out. Therefore, an (M × n)-bit secret key is generated for each channel vector h(f_D) ∈ ℂ^M.
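The phase quantization step can be sketched as follows. The sketch assumes uniform phase sectors starting at zero and natural binary labeling of the sector index; the sector boundaries, the labeling, and the function names are illustrative assumptions, since the letter only states that 2^n-level phase demodulation is used.

```python
import cmath
import math

def quantize_phase(h, n_bits):
    """Map the phase of each coefficient to one of 2**n_bits sectors and emit its bits."""
    levels = 2 ** n_bits
    bits = []
    for coeff in h:
        phase = cmath.phase(coeff) % (2.0 * math.pi)            # wrap into [0, 2*pi)
        sector = int(phase / (2.0 * math.pi) * levels) % levels  # sector index
        # Most significant bit first (natural binary labeling, an assumption).
        bits.extend((sector >> shift) & 1 for shift in range(n_bits - 1, -1, -1))
    return bits

def bit_mismatch_ratio(key_a, key_b):
    """Fraction of positions in which two equal-length keys disagree."""
    return sum(a != b for a, b in zip(key_a, key_b)) / len(key_a)

# One coefficient per quadrant, quantized with n = 2 bits each.
key = quantize_phase([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j], 2)
```

With n = 2, the four example coefficients fall into sectors 0, 1, 2, and 3, so the key is 00 01 10 11. Applying the same function to ĥ(f_D) at the BS and to the user's own estimate yields two M × n-bit keys, and `bit_mismatch_ratio` is one possible way to count their disagreement.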

D. Allocation of Active Time Slots
The BS authenticates the legitimate user by L_1 specific active time slots within a frame, which are allocated based on the shared secret key. More specifically, there are C(L_2, L_1) = L_2!/(L_1!(L_2 − L_1)!) combinations for specifying L_1 out of L_2 (≥ L_1) time slots. Hence, a shared secret key with a length of n = ⌈log_2 C(L_2, L_1)⌉ bits is used for the authentication between the BS and the user.
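A minimal sketch of how key bits could select the active slots is given below. It interprets the key as an integer and unranks it into a combination in lexicographic order; this particular mapping, the modulo fold for key values beyond the number of combinations, and the function names are assumptions, since the letter does not specify the mapping.

```python
import math

def key_length_bits(l2, l1):
    """Number of key bits n = ceil(log2 C(l2, l1)) used for slot allocation."""
    return math.ceil(math.log2(math.comb(l2, l1)))

def slots_from_key(key_bits, l2, l1):
    """Select l1 active slots out of l2 from the key (lexicographic unranking)."""
    total = math.comb(l2, l1)
    # Fold the key into the valid range; the modulo fold is an assumption.
    rank = int("".join(str(b) for b in key_bits), 2) % total
    slots = []
    remaining = l1
    for slot in range(l2):
        if remaining == 0:
            break
        # Number of combinations in which `slot` is the next active slot.
        with_slot = math.comb(l2 - slot - 1, remaining - 1)
        if rank < with_slot:
            slots.append(slot)
            remaining -= 1
        else:
            rank -= with_slot
    return slots

n = key_length_bits(8, 4)                    # C(8, 4) = 70 combinations
active = slots_from_key([0, 1, 1], 4, 2)     # key value 3 with L_2 = 4, L_1 = 2
```

For L_2 = 8 and L_1 = 4 there are 70 combinations, so n = 7 bits. Both the BS and the user would run the same mapping on their own copies of the key, and the BS then only needs to check whether data arrive in the expected slots.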
Compared with conventional public-key-based cryptographic methods, the proposed scheme requires neither rounds of system setup, key generation, distribution, refreshment, or revocation, nor the presence of a third-party certificate authority. Different from the physical-layer authentication schemes using distortion characteristics [21], the proposed scheme provides robust continuous authentication without any continuous parameter update. Furthermore, the proposed scheme enables continuous authentication by simply checking the active time slots without generating or verifying the secret keys periodically, hence achieving lightweight authentication. In the proposed framework, the user sends data to the BS in a grant-free manner. In the presence of uncorrelated fading, spoofers cannot send data in the same time slots as those of the legitimate user, which are activated by the secret key in our scheme. Fig. 2 illustrates a successful case of authentication based on active time-slot allocation. A spoofer's attack may be successful only when the spoofer correctly guesses all the active time slots in a frame.
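The difficulty of guessing all the active slots can be quantified with a short calculation. The sketch assumes a spoofer that knows L_1 and L_2 and picks one of the C(L_2, L_1) slot patterns uniformly at random; the frame length of 16 slots used in the example is a hypothetical value chosen for illustration.

```python
import math

def guess_success_probability(l2, l1):
    """Probability that a uniformly random choice of l1 slots out of l2 hits the active pattern."""
    return 1.0 / math.comb(l2, l1)

# Hypothetical frame of 16 slots with active-slot ratios p = 0.125, 0.25, 0.5.
probabilities = {l1: guess_success_probability(16, l1) for l1 in (2, 4, 8)}
```

Under these assumptions the chance of a successful guess drops from 1/120 for two active slots to 1/1820 for four and 1/12870 for eight, which shows how the ratio of active slots affects the resistance against a random spoofer.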

III. PERFORMANCE RESULTS
In this section, we provide our performance results to characterize the proposed scheme. The channel model between the BS and the user considered in our simulations is illustrated in Fig. 3. For simplicity, the scatterers are positioned on a line along the x-axis and reflect a wave with no amplitude attenuation or phase rotation. In our simulations, P scatterers are selected uniformly at random from the line for each channel generation of h(f_U) and h(f_D). The y-axis distance from the scatterers to the BS, as well as that from the scatterers to the user, is d_y, as shown in Fig. 3. Note that θ_M and θ_m are the maximum and minimum angles of θ_p ∈ [θ_m, θ_M]. The distance between the user's antenna element and the closest BS antenna element is given by d_x = d_y tan θ_M + d_y tan θ_m. The length d_w of the scatterer line is determined by the angular spread ∆θ = θ_M − θ_m and the distance d_y as d_w = d_y tan θ_M − d_y tan θ_m. The edges of the scatterers are given by (x, y) = (tan θ_m, y_d) and (tan θ_M, y_d). Moreover, similar to most previous studies, we assume the absence of a direct BS-to-user link due to blockage by an obstacle, and hence the generated channel tends to obey Rayleigh fading.
As also listed in Table I
Fig. 5. KER between the secret key generated at the BS and that of the user, where we considered n = 1, 2, and 3. The energy-based SKG and the PASKey scheme were employed as the benchmarks.

A. Performance of Deep-Learning-Based Channel Estimation
In our DL-based estimation of the downlink channel vector at the BS, the input and output layers each have 32 nodes, where we train the network to predict the downlink channel vector from the uplink one. The numbers of nodes in the three hidden layers are set as (64, 128, 64), and hence the nodes of the five layers are represented by (32, 64, 128, 64, 32). We employed the ADAM algorithm [20] with a learning rate of 0.001 and 100 epochs for optimization. The training data were collected with T = 512 in (7) and (8). Fig. 4 shows the normalized mean-square error (NMSE) of the estimated downlink channel vector at the BS. Here, the distance d_y was set as 100 m, 200 m, and 500 m. Observe in Fig. 4 that even for a high frequency difference ∆f, accurate channel estimation was achievable, especially for a short distance d_y.
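The NMSE metric can be computed as in the following sketch. It assumes the common definition in which the total squared estimation error is divided by the total power of the true channel vectors over all test snapshots; the letter's exact normalization may differ, and the function names are illustrative.

```python
import math

def nmse(true_vectors, estimated_vectors):
    """Total squared error divided by total channel power (assumed NMSE definition)."""
    error = sum(abs(h - e) ** 2
                for h_vec, e_vec in zip(true_vectors, estimated_vectors)
                for h, e in zip(h_vec, e_vec))
    power = sum(abs(h) ** 2 for h_vec in true_vectors for h in h_vec)
    return error / power

def to_db(value):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(value)

# Two snapshots of a two-element channel with a small estimation error.
h_true = [[2 + 0j, 0 + 2j], [1 + 1j, 1 - 1j]]
h_est = [[1 + 0j, 0 + 2j], [1 + 1j, 1 - 1j]]
example_nmse = nmse(h_true, h_est)
```

In the example, the only error is a unit offset in one coefficient, giving a squared error of 1 against a total power of 12. Evaluating this metric over test channels generated for each ∆f and d_y would produce curves of the kind shown in Fig. 4.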

B. SKG Performance
Fig. 5 shows the key-error ratio (KER) between the generated secret key at the BS and that of the user. We considered two benchmark schemes, i.e., the energy-based SKG scheme without DL and the pilot assistant secret key generation (PASKey) scheme [22], which relies on amplified feedback. More specifically, in the energy-based SKG scheme, the amplitudes of the downlink channel coefficients are regarded as being the same as those of the uplink ones, which are estimated from the pilot signals transmitted from the user. Then, the estimated amplitude is quantized to generate a secret key. Furthermore, in the PASKey scheme, additional amplified pilot feedback from the user allows us to estimate the downlink channel coefficients in a stable manner at the cost of doubled latency, noise amplification of the feedback signal, and information leakage to an eavesdropper, unlike the proposed scheme. As shown in Fig. 5, upon decreasing the generated key length n per channel coefficient and the frequency difference ∆f, the KER improved. The proposed scheme outperformed the energy-based benchmark scheme without DL, where the performance advantage increased with the increase of ∆f. The KER of the idealistic PASKey scheme remains unchanged regardless of ∆f while suffering from information leakage to the eavesdropper, as well as the increased SKG latency. Note that while information reconciliation with channel coding and privacy amplification with a hash function are typically implemented to improve the reliability of SKG [11], we considered only a channel-uncoded scenario for simplicity.

C. Authentication Performance
Fig. 6 shows the authentication-error ratio (AER), where an authentication error is counted either when all the time slots randomly generated by a spoofer match those activated in the proposed scheme or when the generated secret key at the BS does not agree with that of the user. We assumed that the spoofer knows the ratio of the activated time slots over the time slots per frame, p = L_1/L_2, where the ratio was set as p = 0.125, 0.25, and 0.5. The quantization level was given by n = 1. Observe in Fig. 6 that the proposed scheme exhibited benefits similar to those shown in Fig. 5 while maintaining lower overhead and latency in comparison to the PASKey scheme.
Fig. 7. Outage probability comparisons between the proposed scheme, the energy-based SKG scheme, and the PASKey scheme, where the system parameters of (n, p) = (1, 0.5) were employed while considering the receive SNRs of 0 dB, 10 dB, and 25 dB.
Fig. 7 shows the outage probability of the proposed scheme, which is affected by either authentication or data detection. More specifically, an unsuccessful event for data detection is induced when at least one symbol in a frame is mis-detected at the receiver due to the effects of fading, additive white Gaussian noise, and channel estimation errors. Also, the definition of authentication error is the same as that used for the AER in Section III-C. We considered 16 symbols in each frame and set p = 0.5 and n = 1, while the average SNR was given by 0 dB, 10 dB, and 25 dB. The modulation scheme was quadrature phase-shift keying, and the ZF algorithm was used for demodulation. The other parameters are the same as those used in Fig. 6. As shown in Fig. 7, the outage probability improved upon decreasing ∆f while outperforming the energy-based SKG scheme in each scenario. More specifically, the proposed scheme's performance benefits increased upon decreasing the receive SNR.
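The way authentication and detection failures combine into an outage event can be sketched as follows. The sketch assumes that each simulated frame is summarized by an authentication flag and by the transmitted and detected symbol sequences; this data layout and the function names are hypothetical and serve only to illustrate the counting rule.

```python
def frame_in_outage(auth_ok, sent_symbols, detected_symbols):
    """A frame is in outage if authentication fails or any symbol is mis-detected."""
    detection_failed = any(s != d for s, d in zip(sent_symbols, detected_symbols))
    return (not auth_ok) or detection_failed

def outage_probability(frames):
    """Fraction of frames in outage; each frame is (auth_ok, sent, detected)."""
    outages = sum(frame_in_outage(*frame) for frame in frames)
    return outages / len(frames)

# Four toy frames of QPSK symbol indices: two succeed, two fail.
frames = [
    (True, [0, 1, 2, 3], [0, 1, 2, 3]),    # success
    (False, [0, 1, 2, 3], [0, 1, 2, 3]),   # authentication error
    (True, [0, 1, 2, 3], [0, 1, 2, 0]),    # one symbol mis-detected
    (True, [3, 3, 0, 1], [3, 3, 0, 1]),    # success
]
p_out = outage_probability(frames)
```

Here two of the four frames fail, one through authentication and one through a single symbol error, so the estimated outage probability is 0.5. In a full simulation the frames would each carry 16 symbols and be generated per SNR and per ∆f.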

IV. CONCLUSION
In this letter, we proposed DL-based physical-layer channel estimation and lightweight authentication in a non-reciprocal FDD channel. In our scheme, the downlink channel is estimated from the uplink pilot symbols based on DL at the BS, hence reducing the feedback information, the delay, and the information leakage to an eavesdropper, in comparison to the conventional SKG assuming the reciprocal channel. Each legitimate user is authenticated when a user transmits data in the time slots allocated by the physical-layer secret key. Our performance results demonstrated that our authentication functioned while achieving lower latency and error rates in an FDD fading channel than the conventional energy-based benchmark scheme.