Low Complexity LSTM-NN-Based Receiver for Vehicular Communications in the Presence of High-Power Amplifier Distortions

Vehicular communications are an important focus of studies for 5G applications and beyond. However, in a scenario with doubly-selective and highly variable channel characteristics, tracking the wireless channel to ensure communication reliability is one of the main goals to provide communication efficiency. Moreover, multicarrier modulation schemes usually employed in these scenarios are susceptible to nonlinear distortions caused by high power amplifiers (HPA) at the transmitter, impairing the channel estimation and detection capability of the receivers. In view of these requirements and challenges, in the present work we propose a low complexity estimator based on the long short-term memory (LSTM) network, followed by a neural network (NN) in order to improve the data-pilot aided (DPA) estimation. In addition, we propose a new technique to exploit the characteristics of the vehicular channel, by sampling the subcarriers used at the input of the LSTM. Thus, besides tracking the variations of the wireless channel, the LSTM network is also used to interpolate the channel estimates for all subcarriers. The simulation results show the superiority of the proposed scheme in comparison with other state-of-the-art schemes, especially in high signal-to-noise ratio (SNR) regimes. Furthermore, the proposed scheme significantly reduces the computational complexity due to the subcarrier sampling procedure.

vehicular communication based on the orthogonal frequency division multiplexing (OFDM) scheme, with channel estimation supported by pilot subcarriers.
Due to the limited number of data pilots, several methods in the literature have been proposed to improve the channel estimation in vehicular networks. Most methods for IEEE 802.11p networks are based on the data-pilot aided (DPA) scheme, which exploits the demapped data symbols in order to improve the channel estimation, thus providing a low computational complexity solution [3], [4], [5]. However, the performance of these schemes is heavily influenced by the data pilots' reliability, which tends to degrade given the harsh dynamic of vehicular channels. In addition, classical DPA-based methods incur error propagation during the frames, a problem that is even more significant in high-order modulation schemes and high-mobility [6].
In view of the challenges of vehicular communication networks, deep neural network (DNN)-based schemes have been successfully employed recently to improve the channel estimation for vehicular channels. For instance, an autoencoder (AE)-DNN was proposed by [7] in order to improve the DPA method. The DPA-DNN scheme trains a DNN offline, aiming at reducing the error propagation by correcting the errors between the initial DPA estimation and the perfect channel. Convolutional neural networks (CNNs) have also been considered as a solution for vehicular scenarios [8], [9]. The TS-ChannelNet estimator introduced by [8], e.g., suffers from high computational complexity, since it considers integrating both LSTM and CNN networks to achieve the final channel estimates. The authors in [9] present an estimator based on weighted adaptive interpolation, which is able to reduce the complexity and at the same time outperforms TS-ChannelNet, given a modification considered in the IEEE 802.11p standard to allocate the pilots within each transmitted frame, adapting the scheme according to the mobility condition. However, by considering frame-by-frame solutions, both CNN-based receivers require reception of the whole frame before starting the channel estimation, thus increasing the latency and limiting its performance for real-time applications [10].
Moreover, other recent studies have considered more advanced deep learning (DL) algorithms to explore the correlation between OFDM symbols. As it was shown in [11] and [12], DL is able to capture more features of the vehicular channel and to improve the estimation performance compared to conventional methods. In this sense, a promising approach relies on the long short-term memory (LSTM) network, which was introduced by [13] as a neural network with feedback connections, capable of handling sequential information where there is correlation over time. Thus, the LSTM can be a robust and efficient DL solution to track the vehicular channel, especially in high mobility scenarios. Nevertheless, the LSTM architecture still poses a significant challenge related to its high complexity. For instance, the authors in [11] combined the DPA estimation with an LSTM layer followed by a multilayer perceptron (MLP) network. The proposed LSTM-NN-DPA estimator outperforms previous DNN-based estimators in terms of channel estimation. However, such performance gain is attained at the cost of huge computational complexity. Reducing part of the LSTM complexity was addressed by [12], where the proposed channel estimation scheme uses only one LSTM layer within the DPA, while the residual estimation noise is alleviated using a temporal averaging (TA) post-procedure.
A common aspect of the works in [3], [4], [5], [7], [9], [11], and [12] is the consideration of a linear communication environment, assuming an ideal radio frequency (RF) interface. In spite of its advantages, OFDM introduces challenges related to its high peak-to-average power ratio (PAPR) [14], which leads to nonlinear distortions in the output signal of the high power amplifier (HPA) at the transmitter. Many different compensation techniques have been proposed to reduce the effect of these imperfections. At the transmitter side, a digital pre-distortion (DPD) block is commonly adopted in order to linearize the output signal [15]. However, performing such linearization optimally is not trivial and comes at a complexity cost. As an alternative, the HPA nonlinearity can also be compensated at the receiver side, where it may be possible to reduce the power consumption [16].
DL-based processing has been shown to be an efficient tool to compensate HPA nonlinear effects at the receiver, given the nonlinear nature of the DL architectures and thanks to their generalization properties [17], [18]. In this context, we have compared in [19] different conventional vehicular channel estimators and DL-based methods, with the effect of the nonlinear amplification of OFDM signals based on the polynomial distortion model developed by [20] and [21]. Results show that DL-based receivers are intrinsically more robust to the HPA-induced nonlinearities, providing reliable channel estimates even in high-mobility scenarios. Nevertheless, the effort in [19] only highlights the robustness of hybrid estimators that combine DNNs with conventional methods, inspiring this work toward designing novel receiver architectures.
Furthermore, another key characteristic of vehicular communication channels is related to a certain smoothness in the frequency domain, which can be exploited to reduce complexity of the channel estimation at the receiver. For instance, the authors in [22] propose a CNN-based channel estimation and phase noise compensation scheme for doubly-selective channels (in time and frequency) by considering only part of the pilots from the channel. The channel estimation process is treated as an image completion problem, so that the proposed solution is shown to be robust enough to track the channel variation in both frequency and time domains. In addition, the work in [23] also exploits the frequency response smoothness to perform channel estimation. The proposed scheme is based on a truncated discrete Fourier transform interpolation, which uses only the dominant channel taps from the channel delay profile to perform estimation. As their main result, the proposed estimator outperforms conventional methods that employ all data subcarriers to obtain the channel estimation, while also having a decreased computational complexity. An overview of the literature for IEEE 802.11p channel estimators and their respective techniques is presented in Table 1.
In this paper, we propose a novel receiver for vehicular communications subject to HPA-induced distortions, exploiting the features of the channel in the frequency domain. It is worth pointing out that both the HPA nonlinearity and the relative smoothness of the wireless channel in the frequency domain are crucial for practical vehicular communication scenarios, and have not been well explored in the literature yet. In our proposed method, a first estimate given by the DPA method is fed to an LSTM layer, which tracks the channel variation and learns the channel correlation in the time domain. The LSTM is then followed by a shallow neural network (NN) in order to enhance the denoising capability. Such a combination of the DPA, the LSTM layer and the NN is key to dealing with the HPA distortions at the receiver. Therefore, we denote our proposed scheme by DPA-LSTM-NN. In addition, and unlike previous works, we exploit the channel response in the frequency domain in order to reduce the LSTM size. To that end, we employ subcarrier sampling at the input of the LSTM, so that the interpolation of the missing subcarriers' information is performed by the LSTM itself. The main contributions of this paper are summarized as follows:
• The proposed DPA-LSTM-NN estimator exhibits robust performance in the presence of HPA-induced nonlinearities. The numerical results show that the DPA-LSTM-NN proposal outperforms the DPA-DNN [7], LSTM-NN-DPA [11] and LSTM-DPA-TA [12] schemes both in terms of bit error rate (BER) and normalized mean square error (NMSE). For instance, we show that a BER of $10^{-4}$ can be achieved only with the proposed scheme in some situations, depending on the employed modulation order and velocity.
• The obtained results show that the DPA-LSTM-NN scheme outperforms other methods from the literature regardless of the velocity level. In addition, only a slight performance degradation of the DPA-LSTM-NN is observed in very high-mobility scenarios (up to 200 km/h).
• A significant reduction of the computational complexity is obtained by sampling the subcarriers at the input of the LSTM layer. Our proposed DPA-LSTM-NN scheme is the least complex scheme, measured in terms of the number of required real-valued operations, compared to [7], [11], and [12].
The remainder of this paper is organized as follows. The system model is presented in Section II, including the main characteristics of the HPA nonlinear distortion model and the vehicular channel model. The proposed DPA-LSTM-NN channel estimator with subcarrier sampling is detailed in Section III, while other benchmark DL-based channel estimation schemes are described in Section IV. Results and discussions are presented in Section V and Section VI concludes the paper. Finally, for convenience, the acronyms and symbols adopted in this work are summarized in Tables 2 and 3, respectively.

II. SYSTEM MODEL
Let us consider the IEEE 802.11p standard [2] as the basis for our analysis, which employs OFDM modulation in order to enable vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication. As illustrated by Figure 1, each transmitted packet consists of a preamble, a signal field, which carries the physical layer information, and a data field. The preamble includes short and long training symbols, known by the receiver in order to perform synchronization. In addition, the long training symbols are divided into two predefined sequences $t_{p,1}$ and $t_{p,2}$, used for channel estimation. Moreover, a cyclic prefix (CP) is used to absorb the inter-symbol interference (ISI) caused by the multi-path propagation.
We denote $\mathcal{K}_{\mathrm{on}}$ as the set of active subcarriers, where $K_{\mathrm{on}} = |\mathcal{K}_{\mathrm{on}}|$ is the cardinality of the set. Then, for each active subcarrier $k \in \mathcal{K}_{\mathrm{on}}$ within the $i$-th OFDM symbol, the demodulated OFDM frame in the frequency domain can be expressed as in (1), where $\varepsilon_p$ is the preamble power per sample, $\xi$ the average signal-to-noise ratio (SNR) at the receiver and $K$ is the total number of subcarriers employed within each OFDM symbol (note that $K > K_{\mathrm{on}}$). The coefficients of the channel response $H[k, i]$ are modeled according to a Rayleigh fading distribution with Jakes' Doppler spectrum and a Doppler frequency given by [25]
\[ f_d = \frac{\nu}{c} f_c, \]
where $\nu$ is the velocity of the vehicle in m/s, $c$ is the speed of light in m/s and $f_c$ is the carrier frequency. In order to lighten the notation considered hereafter, we rewrite (1) in terms of $y_i[k]$, the received value on the $k$-th subcarrier of the $i$-th transmitted data symbol, where $u_i[k]$ denotes the $k$-th subcarrier in the $i$-th transmitted OFDM data symbol at the output of the HPA, subjected to nonlinear distortions, as follows.

A. HIGH POWER AMPLIFIER DISTORTION MODEL
To model $u_i[k]$, let us denote the signal at the input of the HPA as $x_i[k]$, so that we have a non-compensated HPA output $\tilde{u}_i[k]$ given by
\[ \tilde{u}_i[k] = \gamma_0 x_i[k] + \delta_i[k], \tag{5} \]
where $\delta_i[k]$ is a nonlinear distortion with zero mean and variance $\sigma_\delta^2$ that is uncorrelated with the input, while $\gamma_0$ is a complex gain. Then, in order to model the HPA nonlinear distortions in (5) we follow [20], focusing on a memoryless HPA. The advantage of such a model is that it characterizes both amplitude to amplitude (AM/AM) and amplitude to phase (AM/PM) distortions, while it fits a commercial evaluation of an HPA from the 3GPP [26] into a polynomial.
This model shows that the HPA response is usually constant over the useful signal frequency band, allowing us to neglect the memory effect of the HPA on the channel. In addition, phase compensation can be assumed to be perfectly done at the receiver, as is standard in several works in the literature [27], [28]. Furthermore, the key component in this analysis is Bussgang's theorem [29], which states that if the input signal at the HPA has a Gaussian distribution, as is the case for an OFDM symbol with a sufficiently large number of subcarriers, the output signal of the HPA can be written as in (5). Moreover, the accuracy of the considered model has been validated in the literature [20], [21].
In practice, in order to reduce the effects of the nonlinearities, the HPA operates at a given input back-off (IBO) from the 1 dB compression point, which refers to the input power level at which the amplifier characteristic has dropped by 1 dB from the ideal linear characteristic [30]. Therefore, the input signal $x_i[k]$ is scaled by a gain before being amplified by the HPA to ensure the desired IBO. This gain is set by the IBO, given in dB, together with $\tau_{1\mathrm{dB}}$, the input power at the 1 dB compression point, and the mean power of the input signal. The HPA itself is described by its AM/AM and AM/PM characteristics, which determine the complex soft envelope of the amplified output signal $\tilde{u}_i[k]$. In our work, we consider that the soft envelope of the amplified signal is approximated by a polynomial of order $P_o$, in which $a_l$ denotes the complex coefficients, obtained with the least squares (LS) method [20]; the input/output relationship of the HPA is approximated accordingly. Finally, we assume perfect estimation and compensation of $\gamma_0$. Thus, we can write the output of the HPA as $u_i[k] = \tilde{u}_i[k]/\gamma_0$, where the residual nonlinear distortion of the HPA usually yields a BER floor at the receiver. Figure 2 illustrates the transmission system modeled in the presence of the nonlinear HPA.
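The decomposition of the HPA output into a scaled copy of the input plus an uncorrelated distortion can be checked numerically. The following is a minimal sketch, not the model of [20]: it assumes a memoryless polynomial of the form $\sum_l a_l x |x|^{l-1}$ with made-up coefficients, uses complex Gaussian samples as a stand-in for the OFDM signal, and estimates $\gamma_0$ by projecting the output onto the input. The names `hpa_poly` and `bussgang_decompose` are illustrative only.

```python
import random

def hpa_poly(x, coeffs):
    """Memoryless polynomial HPA (assumed form): sum_l a_l * x * |x|**(l-1), l = 1..P_o.

    coeffs[0] is a_1, so the enumerate index equals the exponent l-1.
    """
    return sum(a * x * abs(x) ** p for p, a in enumerate(coeffs))

def bussgang_decompose(x, u):
    """Split u into gamma0 * x + delta, with delta uncorrelated with x."""
    num = sum(ui * xi.conjugate() for ui, xi in zip(u, x))
    den = sum(abs(xi) ** 2 for xi in x)
    gamma0 = num / den
    delta = [ui - gamma0 * xi for ui, xi in zip(u, x)]
    return gamma0, delta

rng = random.Random(0)
# Complex Gaussian samples stand in for the signal at the HPA input.
x = [complex(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20000)]
coeffs = [1.0 + 0.0j, 0.0j, -0.15 + 0.05j]  # hypothetical a_1, a_2, a_3

u = [hpa_poly(xi, coeffs) for xi in x]
gamma0, delta = bussgang_decompose(x, u)

# Sample cross-correlation between the distortion and the input.
cross = sum(d * xi.conjugate() for d, xi in zip(delta, x)) / len(x)
```

With these hypothetical coefficients the estimated gain has magnitude below one (compression), and the sample cross-correlation `cross` vanishes up to rounding, which is the property the receiver relies on when it compensates $\gamma_0$ and treats the remainder as distortion.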

B. VEHICULAR CHANNEL MODEL
We consider the vehicular channel model described in [31], where the authors provide the Doppler-delay characteristics of different environments. The characterization is based on real measurements with one or two vehicles moving at different velocities, which models V2I and V2V scenarios, respectively. The channel models are considered with a tapped-delay line, where each tap is statistically described by a Rayleigh fading distribution with a Doppler power spectral density. Throughout this paper, we consider the urban canyon (UC) model with two vehicles communicating with each other, i.e., the V2V-UC channel model. Table 4 describes the power delay profile (PDP) of the employed V2V-UC channel, while Figure 3 illustrates its channel frequency response for a velocity v = 48 km/h. From the figure we can observe that the V2V-UC channel presents a smooth variation in the frequency domain. This characteristic will be exploited to down-sample the subcarriers at the input of the LSTM layer in our proposed method in Section III.
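To give a sense of how fast the channel varies in the mobility scenarios considered later, the Doppler frequency of Section II, $f_d = \nu f_c / c$, can be evaluated numerically. The following is a minimal sketch, assuming the carrier frequency $f_c = 5.9$ GHz and the three velocities of the simulation setup, and taking the stated velocity directly as $\nu$. The function name `doppler_frequency_hz` is illustrative and not part of the original work.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def doppler_frequency_hz(velocity_kmh, carrier_hz):
    """Doppler frequency f_d = (v / c) * f_c, with v converted from km/h to m/s."""
    velocity_ms = velocity_kmh / 3.6
    return velocity_ms / SPEED_OF_LIGHT * carrier_hz

if __name__ == "__main__":
    for v in (48, 100, 200):
        f_d = doppler_frequency_hz(v, 5.9e9)
        print(f"v = {v:3d} km/h -> f_d = {f_d:7.1f} Hz")
```

The resulting values range from a few hundred hertz to about one kilohertz, far below the subcarrier spacing of 10 MHz / 64 = 156.25 kHz.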

III. PROPOSED DPA-LSTM-NN CHANNEL ESTIMATOR WITH SUBCARRIER SAMPLING
In this section, we propose a novel learning-based architecture for the receiver exploiting the vehicular channel characteristics. Using DPA, the proposed DPA-LSTM-NN scheme performs first a coarse channel estimation that is used as the input of an LSTM layer. Since the LSTM is a powerful tool to track the channel variation and learn the channel correlation in the time domain, we favored the use of the DPA method instead of more complex estimators, such as the spectral temporal averaging (STA) [3] or the time domain reliable test frequency domain interpolation (TRFI) [5]. The LSTM is then followed by a NN in order to mitigate the remaining noise from the hybrid estimator, refining the channel estimation. Such a combination of the DPA, LSTM and NN provides robustness with respect to the HPA distortions at the receiver. Furthermore, given the smooth variation of the channel response in the frequency domain observed in Section II-B, we exploit such characteristic in order to reduce the LSTM input size. It is worth noticing that the LSTM layer usually requires a high computational cost. Consequently, reducing the size of its input is of paramount importance to address such high complexity issue.

A. DPA INITIAL ESTIMATION
As illustrated in Figure 4, for a given subcarrier $k \in \mathcal{K}_{\mathrm{on}}$ the DPA method combines at its input the $i$-th received OFDM symbol ($y_i[k]$) and the channel estimate of the previous symbol ($\hat{h}^{\mathrm{DPA}}_{i-1}[k]$). The first DPA estimate is obtained via the LS method, so that
\[ \hat{h}^{\mathrm{DPA}}_{0}[k] = \hat{h}^{\mathrm{LS}}[k], \qquad \hat{y}^{\mathrm{eq}}_{i}[k] = \frac{y_i[k]}{\hat{h}^{\mathrm{DPA}}_{i-1}[k]}, \]
so that $\hat{y}^{\mathrm{eq}}_{i}[k]$ is further demapped to the nearest constellation symbol to result in $d_i[k]$. Finally, the DPA channel estimate is obtained as
\[ \hat{h}^{\mathrm{DPA}}_{i}[k] = \frac{y_i[k]}{d_i[k]}. \]
Note that, in contrast to the LS estimation, which exhibits significant degradation due to the time variation, the DPA enhances the performance by exploiting the correlation between adjacent symbols in the OFDM transmission.
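The DPA recursion described above (equalize with the previous estimate, demap to the nearest constellation point, divide the received value by the decision) can be sketched in a few lines. The following is a minimal illustration, assuming a unit-energy QPSK constellation and an initial estimate supplied by the caller, for instance the LS estimate from the preamble. The names `demap` and `dpa_estimate` are hypothetical and not taken from the original work.

```python
import math

# Unit-energy QPSK constellation (illustrative choice; 16-QAM is handled the same way).
QPSK = [complex(a, b) / math.sqrt(2) for a in (-1, 1) for b in (-1, 1)]

def demap(symbol, constellation=QPSK):
    """Return the constellation point closest to `symbol`."""
    return min(constellation, key=lambda p: abs(symbol - p))

def dpa_estimate(y, h_init):
    """Data-pilot aided channel tracking over one frame.

    y      : list of received OFDM symbols, each a list of complex subcarrier values
    h_init : initial per-subcarrier channel estimate (assumed given, e.g. LS)
    Returns one list of per-subcarrier channel estimates per OFDM symbol.
    """
    h_prev = list(h_init)
    estimates = []
    for y_i in y:
        h_i = []
        for k, y_ik in enumerate(y_i):
            y_eq = y_ik / h_prev[k]   # equalize with the previous estimate
            d_ik = demap(y_eq)        # hard decision on the transmitted symbol
            h_i.append(y_ik / d_ik)   # updated channel estimate
        estimates.append(h_i)
        h_prev = h_i
    return estimates
```

A wrong decision `d_ik` corrupts `h_i`, which is then reused for the next symbol; this is the error propagation that the LSTM layer and the NN are meant to counteract.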

B. LSTM LAYER
Although DPA improves the performance when compared to the LS estimator, a relevant performance loss is observed in communication scenarios with high mobility. In these cases, the demapping error increases since the correlation between symbols, exploited by the DPA, decreases [24]. In order to deal with this issue, we design an LSTM layer after the DPA initial estimation. It is based on recurrent units to process and learn from a sequence of data [13]. This is done by internal gate units capable of storing the memory content of the data, while employing structures capable of deciding when to keep, or override, the information in these memory cells. Therefore, such advanced processing characteristics make the LSTM able to learn the channel correlation over time and adapt the channel estimates accordingly. Figure 5 illustrates the classical LSTM unit used in our approach. Internally, there are three inputs per LSTM unit: $l_t$, $o_{t-1}$ and $c_{t-1}$, denoting respectively the input at the current time step $t$, and the output of the hidden state and the memory at the previous time step $(t-1)$. The operations with the inputs are illustrated by the sigmoid activation function $\sigma$ and the hyperbolic tangent $\tanh$, following [13]. These operations define which information is overridden and which is kept memorized in the current cell state. As outputs, the LSTM unit produces $c_t$, the memory cell state at the time step $t$, and the output $o_t$. The loop continues until the end of the sequential information, so that $o_t$ of the last unit is the output of the LSTM network. In the context of channel estimation, a number $U$ of LSTM inputs must be used, which is related to the number of active subcarriers. In addition, each LSTM network has $P$ hidden states, dictating the number of steps $t$ for recurrent operations.
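For readers less familiar with the gate structure, a single step of an LSTM unit can be written out explicitly. The following is a minimal sketch of the widely used formulation with forget, input and output gates; whether Figure 5 matches this variant in every detail is an assumption. The symbols `l_t`, `o_prev` and `c_prev` follow the notation above, while the parameter layout (`params` mapping gate names to weight/bias pairs) is purely illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W @ v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(l_t, o_prev, c_prev, params):
    """One time step of a standard LSTM unit.

    l_t    : input at time step t (length U)
    o_prev : output / hidden state at time step t-1 (length P)
    c_prev : memory cell state at time step t-1 (length P)
    params : dict with keys "f", "i", "g", "o"; each value is a (W, b) pair
             acting on the concatenation [o_prev, l_t]
    """
    v = list(o_prev) + list(l_t)
    f = [sigmoid(z) for z in affine(*params["f"], v)]    # forget gate
    i = [sigmoid(z) for z in affine(*params["i"], v)]    # input gate
    g = [math.tanh(z) for z in affine(*params["g"], v)]  # candidate memory
    o = [sigmoid(z) for z in affine(*params["o"], v)]    # output gate
    # Keep part of the old memory and write part of the candidate.
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    o_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return o_t, c_t
```

Each of the four weight matrices has $P \times (P + U)$ entries, which is why shrinking the input size $U$ and the number of hidden states $P$ directly reduces the cost of the layer.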

C. SUBCARRIER SAMPLING
The small maximum delay spread of the considered channel leads to a weak frequency selectivity, i.e., h[k] ≈ h[k ± 1]. Therefore, given the set of nonlinear forward and feedback operations performed by the LSTM layer, it may be possible to exploit this local flat fading in the frequency domain and operate with a reduced subset of subcarriers, resulting in a reduced size of the LSTM layer. This will considerably decrease the computational complexity, at the cost of a slight degradation of the channel estimation performance.
Thus, we define a subset $\mathcal{S} \subset \mathcal{K}_{\mathrm{on}}$, so that only the DPA estimates $\hat{h}^{\mathrm{DPA}}_{i}[s]$, $\forall s \in \mathcal{S}$, are selected as inputs of the LSTM layer. Moreover, we also define $\mathcal{K}_p$ as the set containing the $K_p$ pilot subcarriers, while $\mathcal{K}_d$ is the set of the $K_d$ data subcarriers, so that $\mathcal{K}_{\mathrm{on}} = \mathcal{K}_p \cup \mathcal{K}_d$. As an example, let us consider a slice of Figure 3 for an arbitrary symbol index, plotting the magnitude of the V2V-UC channel as a function of the subcarrier index. Figure 6a shows all active subcarriers for a given symbol index, with pilot subcarriers shown as dashed lines and data subcarriers as solid lines. In this example, the scenario follows the IEEE 802.11p standard, where there are $K_{\mathrm{on}} = 52$ active subcarriers, out of which $K_p = 4$ subcarriers are pilots and the remaining $K_d = 48$ subcarriers carry the data.
Notice that the inclusion of the set $\mathcal{K}_p$ in $\mathcal{S}$ is mandatory since it carries the OFDM pilots, so that $\mathcal{K}_p \subset \mathcal{S}$. Therefore, we sample only among the subcarriers in $\mathcal{K}_d$. Figure 6b illustrates a 1/2 sampling rate, where the $K_p = 4$ pilot subcarriers are included, while 24 out of the $K_d = 48$ data subcarriers are chosen. The selected subcarriers are taken using a simple down-sampling pattern. In this manner, the size of the LSTM layer can be adjusted according to the cardinality of $\mathcal{S}$, reducing complexity in the channel estimation.
Finally, it is worth noting that the input of the LSTM layer has size 2 |S|, while its output has size 2 |K on |, related to the real and imaginary parts from the complex-valued channel estimations. The interpolation to produce the channel estimates for all active subcarriers is intrinsically performed by the LSTM, by means of training.
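The construction of the sampled set can be illustrated with a short script. The following is a minimal sketch, assuming the usual IEEE 802.11p subcarrier indexing (active indices -26 to 26 without the DC subcarrier, pilots at -21, -7, 7 and 21) and one possible regular down-sampling pattern; the exact pattern used in the paper is not specified beyond being simple. The function names and the stacking order of real and imaginary parts are illustrative assumptions.

```python
ACTIVE = [k for k in range(-26, 27) if k != 0]  # 52 active subcarriers
PILOTS = [-21, -7, 7, 21]                       # pilot positions
DATA = [k for k in ACTIVE if k not in PILOTS]   # 48 data subcarriers

def sample_subcarriers(keep, out_of):
    """Keep the first `keep` of every `out_of` consecutive data subcarriers, plus all pilots."""
    kept = [k for j, k in enumerate(DATA) if j % out_of < keep]
    return sorted(kept + PILOTS)

def to_real_features(h_hat, subset):
    """Stack real and imaginary parts of the selected estimates (LSTM input of size 2|S|)."""
    picked = [h_hat[k] for k in subset]
    return [z.real for z in picked] + [z.imag for z in picked]

if __name__ == "__main__":
    for keep, out_of in ((1, 1), (2, 3), (1, 2), (1, 3), (1, 4)):
        subset = sample_subcarriers(keep, out_of)
        print(f"rate {keep}/{out_of}: |S| = {len(subset)}, LSTM input size = {2 * len(subset)}")
```

With a 1/2 rate this yields $|\mathcal{S}| = 28$ and an LSTM input of size 56, while the output keeps size $2K_{\mathrm{on}} = 104$ because the network is trained to return estimates for all active subcarriers.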

D. NN POST-PROCESSING AND TRAINING
The output from the LSTM layer is then processed by a shallow NN with $N_1$ neurons to reduce the noise and refine the channel estimates. We follow [32] to define the parameters related to the training and testing stages of our method. The number of samples used for the training and testing phases is defined by splitting 10000 different realizations of the vehicular channel into sets with 80% and 20% of the total, respectively. The batch size is set to be sufficiently smaller than the size of the training dataset, thus speeding up generalization and the training process, while the number of training epochs is set large enough to ensure the convergence of the model. For the optimizer, we favored the adaptive moment estimation (ADAM), together with the ReLU activation function, to minimize the loss between the perfect channel and the estimates from the proposed DPA-LSTM-NN. This choice is motivated by its fast computing time, the small number of parameters to tune, and its well-known ability to solve optimization problems. Moreover, as suggested in [32], the learning rate is set to 0.001 and is automatically adapted by ADAM during training, until the method converges. Table 5 summarizes the DL architecture and parameters used in the training phase of our proposed scheme. Finally, Figure 7 presents the block diagram of the proposed DPA-LSTM-NN architecture.

IV. BENCHMARK CHANNEL ESTIMATION SCHEMES
In this section we briefly describe three state-of-the-art DL-based channel estimators that will be compared with our method. Specifically, we consider the DPA-DNN [7], the LSTM-NN-DPA [11] and the LSTM-DPA-TA [12] schemes. These designs have been chosen from Table 1 since they also combine DPA estimation with DL techniques for vehicular channels, with the last two also employing LSTM units.

A. DPA-DNN CHANNEL ESTIMATOR
The DPA-DNN scheme was proposed in [7] in order to improve the DPA method using an AE-DNN. Their receiver considers an initial DPA estimation that is followed by an offline trained AE with three hidden layers, consisting respectively of 40, 20 and 40 neurons. Figure 8 illustrates their approach, in which the goal of the DNN is to update the estimation initially obtained with the DPA, by learning to correct the estimation errors between $\hat{h}^{\mathrm{DPA}}_{i}[k]$ and the perfect channel. The output is denoted by $\hat{h}^{\mathrm{DPA\text{-}DNN}}_{i}[k]$, which is the DPA-DNN channel estimation.
The authors in [7] show that the trained DNN is capable of learning the channel frequency domain characteristics, preventing the error propagation typical of the DPA method. In addition, although only a V2V communication scenario free of the HPA nonlinear distortions has been considered in [7], we have shown in [19] that DNN-based methods implicitly have some robustness against these nonlinearities. This is different from the case of using only conventional channel estimators, without DNNs, for which the performance is considerably degraded by the HPA distortions. As our numerical results will show, the DPA-DNN also has interesting performance in the presence of the HPA nonlinearities, but still is outperformed by our proposed approach.

B. LSTM-NN-DPA CHANNEL ESTIMATOR
LSTM networks have been recently employed in the context of vehicular channel estimation. For example, the LSTM-NN-DPA scheme has been proposed in [11], which employs an LSTM network allied with a NN in order to reconstruct the channel as close as possible to the ideal channel response. The authors consider that the input of the LSTM receives the LS of the K p pilot subcarriers, in two consecutive The block diagram of the LSTM-NN-DPA scheme is shown in Figure 9, while numerical results in [11] show that this method is able to learn the time and frequency characteristics of the channel, tracking its variation and mitigating noise. Thus, significant performance improvement in comparison to previous DNN-based receivers has been achieved.

C. LSTM-DPA-TA CHANNEL ESTIMATOR
Another LSTM-based receiver has been proposed by [12], where the LSTM estimates are directly fed to the DPA method, producing $\hat{h}^{\mathrm{LSTM\text{-}DPA}}_{i}[k]$ as an output. Then, noise mitigation is achieved by means of a TA scheme, defined in (14), where $\alpha$ defines the fixed time averaging weight. Figure 10 illustrates the block diagram of the LSTM-DPA-TA scheme. Furthermore, this estimator exhibits a lower computational complexity when compared to LSTM-NN-DPA, while achieving similar performance in different mobility scenarios. Nevertheless, both LSTM-NN-DPA and LSTM-DPA-TA still require a large number of neurons to perform the operations in the LSTM units, since all active subcarriers are used.
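As an illustration of what a fixed-weight temporal averaging step looks like, the following minimal sketch assumes a first-order recursive average in which the new estimate is weighted by $1/\alpha$ and the running average by $1 - 1/\alpha$. This form is an assumption made for illustration; the exact expression is the one given in (14). The initialization with the first estimate and the default value of `alpha` are placeholders as well.

```python
def temporal_average(h_seq, alpha=2.0):
    """Recursive averaging across OFDM symbols (assumed form).

    h_ta[i] = (1 - 1/alpha) * h_ta[i-1] + (1/alpha) * h_seq[i]

    h_seq : list of per-symbol channel estimates, each a list over subcarriers
    alpha : fixed time averaging weight (placeholder default)
    """
    h_ta = [list(h_seq[0])]  # assumption: start from the first estimate
    for h_i in h_seq[1:]:
        prev = h_ta[-1]
        h_ta.append([(1 - 1 / alpha) * p + (1 / alpha) * h for p, h in zip(prev, h_i)])
    return h_ta
```

Such a filter needs only a handful of operations per subcarrier, which is consistent with the lower complexity reported for LSTM-DPA-TA relative to a scheme that appends a full NN.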

V. SIMULATION RESULTS
In our simulations we use the IEEE 802.11p standard as basis, with a 10 MHz bandwidth and carrier frequency $f_c = 5.9$ GHz. Each transmitted OFDM frame consists of $L = 50$ symbols. Moreover, a total of $K = 64$ subcarriers are employed within each OFDM symbol, of which only $K_{\mathrm{on}} = 52$ are active, while the remaining $K_n = 12$ subcarriers are used as a guard band (inactive). In addition, $K_p = 4$ out of the $K_{\mathrm{on}}$ subcarriers are allocated as pilots, while the remaining $K_d = 48$ active subcarriers carry the data. We also assume perfect synchronization at the receiver, with constantly updated channel estimation. Table 6 summarizes the considered simulation parameters, including the IEEE 802.11p standard physical layer specifications, recalling that we denote by $\mathcal{K}_{\mathrm{on}}$, $\mathcal{K}_p$ and $\mathcal{K}_d$ the sets of the $K_{\mathrm{on}}$ active, $K_p$ pilot and $K_d$ data subcarriers, respectively.
The performance evaluation of the proposed DPA-LSTM-NN scheme is done in terms of BER, NMSE and computational complexity, and compared with DPA-DNN [7], LSTM-NN-DPA [11] and LSTM-DPA-TA [12] schemes. Following [33], the training for all the estimators is performed at the highest expected SNR value, ξ = 30 dB, in order to reduce the impact of the noise and better learn the channel variations. In addition, in order to have a fair comparison between the solutions in terms of complexity, we considered Furthermore, we consider the V2V-UC vehicular channel model, with two vehicles moving in opposite directions at v = 48 km/h (low mobility scenario), v = 100 km/h (high mobility scenario) and v = 200 km/h (very high mobility scenario). We also considered 16-QAM and QPSK modulation techniques, aiming to cover both lower and higher modulation order aspects in the analysis, while the impact of the HPA nonlinearities has been considered for IBO = 4 dB for the higher modulation order and, since QPSK is considerably more robust with respect to the nonlinearities, we extend our analysis to higher effects of HPA-induced nonlinearities, employing IBO = 2 dB in this case.

A. BER AND NMSE PERFORMANCE
First, we investigate the impact of the subcarrier down-sampling factor on the BER performance of the proposed DPA-LSTM-NN scheme. Figure 11 plots the BER as a function of the SNR of the DPA-LSTM-NN estimator for the low mobility scenario (v = 48 km/h) and 16-QAM modulation with IBO = 4 dB. Notice that we indicate the size of the LSTM unit and the number of neurons of the NN in the legend. For instance, (52-15) indicates an LSTM unit with $P = 52$ hidden states and an NN with $N_1 = 15$ neurons. We have considered different sets of sampled subcarriers with $P = |\mathcal{S}| \in \{52, 36, 28, 20, 16\}$. Since the $K_p = 4$ pilot subcarriers are always included in $\mathcal{S}$, these cases correspond to sampling the data subcarriers with rates 1/1, 2/3, 1/2, 1/3 and 1/4, respectively. We observe that it is possible to considerably reduce the LSTM input size $U$ and the number of hidden states $P$ with only a slight degradation in the BER performance. In particular, the LSTM proves capable of interpolating the information of the missing subcarriers even with $P = 28$. Therefore, in the sequel we only consider the DPA-LSTM-NN scheme with $P = |\mathcal{S}| = 28$ hidden states and an LSTM input size $U = 2|\mathcal{S}| = 56$.
Figure 12 compares the BER performance of the estimation schemes using 16-QAM modulation and IBO = 4 dB. As illustrated in Figure 12a for the low mobility scenario, LSTM-NN-DPA [11] and LSTM-DPA-TA [12] perform better than our proposed scheme at low SNR. This is due to the demapping error of the DPA method, which increases at low SNR. Thus, since [11] and [12] use the LSTM layer before the DPA, they achieve better performance in this regime. However, when the SNR increases, the DPA method provides cleaner information to the LSTM layer, compared to the LS estimate used in [11] and [12]. As a result, the DPA-LSTM-NN scheme outperforms all other benchmark methods when $\xi \geq 22$ dB. Note also that such an SNR level is crucial to achieve a BER lower than $10^{-3}$, required by many practical applications.
Furthermore, for high and very high mobility scenarios, respectively in Figures 12b and 12c, we observe a higher advantage for the proposed DPA-LSTM-NN estimator, outperforming the other solutions regardless of the SNR. It is also important to highlight that the proposed method is the sole estimator to achieve a BER on the order of $10^{-4}$ for high and very high mobility. In addition, considering a BER of $10^{-3}$, the proposed scheme has 4 dB of SNR gain compared to the LSTM-DPA-TA method in Figure 12b, and 2 dB of SNR gain compared to the LSTM-NN-DPA method in Figure 12c.
The performance improvement of the proposed estimator with respect to LSTM-NN-DPA and LSTM-DPA-TA is illustrated in Figure 13 in terms of the NMSE gap. We calculate the NMSE for fixed SNR ξ = 30 dB, 16-QAM modulation, IBO = 4 dB, for different velocities. Comparing DPA-LSTM-NN and LSTM-NN-DPA, we observe that the NMSE gap is always higher than 40% regardless of v. Comparing DPA-LSTM-NN and LSTM-DPA-TA the NMSE gap is always higher than 20%, increasing with v. This result shows that the proposed DPA-LSTM-NN performs better in minimizing the error between the perfect channel and its channel estimates in high SNR, being a better choice for tracking the channel in presence of nonlinear distortions.
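For reference, the NMSE between the perfect channel and its estimate can be computed as below. This is a minimal sketch, assuming the common definition in which the squared estimation error is normalized by the energy of the true channel, accumulated over all symbols and subcarriers. How the percentage NMSE gap of Figure 13 is derived from two such values is not restated here, so the sketch stops at the NMSE itself; the function name `nmse` is illustrative.

```python
def nmse(h_true, h_est):
    """NMSE = sum |h - h_hat|^2 / sum |h|^2 over all symbols and subcarriers.

    h_true, h_est : lists of OFDM symbols, each a list of complex channel values
    """
    err = sum(abs(a - b) ** 2 for ht, he in zip(h_true, h_est) for a, b in zip(ht, he))
    ref = sum(abs(a) ** 2 for ht in h_true for a in ht)
    return err / ref
```

An estimate that is uniformly 10% too small in amplitude, for example, gives an NMSE of 0.01 (that is, -20 dB) under this definition.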
In order to focus on the effects of the HPA-induced nonlinearities, the error rate performance is evaluated with IBO = 2 dB in Figure 14. Low, high and very high mobility scenarios are considered with QPSK modulation. Similarly to the results with 16-QAM modulation, we observe that the proposed DPA-LSTM-NN scheme outperforms the other methods, except in the low mobility scenario at low SNR. Nevertheless, both the LSTM-NN-DPA and LSTM-DPA-TA estimators present an error floor at high SNR. This is mainly due to the low IBO, since the LS estimation used as the input of the LSTM layers in [11] and [12] is highly degraded by the HPA nonlinear distortions. In addition, the performance gap between the LSTM-NN-DPA and LSTM-DPA-TA estimators and our proposed method increases with the SNR, since the DPA method provides more reliable channel estimates in this case. Figure 15 corroborates this analysis by showing the NMSE gap between our estimator and the benchmark LSTM-based estimators in the same scenario as Figure 14. Conclusions similar to those of Figure 13 can be drawn, with the DPA-LSTM-NN method outperforming the other schemes by at least 53%. Interestingly, the gap is higher in low mobility scenarios and slightly decreases with v.

B. COMPUTATIONAL COMPLEXITY ANALYSIS
In order to compare the computational complexity of the schemes, we calculate the number of real-valued operations in terms of multiplications/divisions and summations/subtractions, required to estimate the channel from a received OFDM symbol.
The computational complexity of the DPA-DNN estimator has been detailed in [24]. The initial DPA estimation requires $18K_{on}$ multiplications/divisions and $8K_{on}$ summations/subtractions, while the total number of multiplications and summations of the DNN depends on the number of neurons at each layer, following the expressions in [24]. The shallow NN, in turn, has a single hidden layer, so that its computational complexity depends only on its input size, its number of hidden neurons and its output size.
The computational complexity of the LSTM unit has been detailed in [12], and depends on the input size $U$ of the LSTM unit and on the number of hidden states $P$. The LSTM-NN-DPA estimator considers $U = 2(K_{on} + K_p) = 112$ inputs for the LSTM, where the multiplication by two takes both real and imaginary parts into account, and $P = K_{on} = 52$ hidden states. In addition, the input size of the NN matches the size of the LSTM output, as does its output size, which is related to the number of subcarriers, so that $N_0 = N_2 = 2K_{on} = 104$. Also, $N_1 = 15$ has been considered for all schemes in this paper. Thus, combining the computational complexity of the LSTM, the NN and the DPA yields $12K_{on}^2 + 81K_{on} + 8K_{on}K_p$ multiplications/divisions and $89K_{on} + 8K_p - 8$ summations/subtractions.
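The multiplication count quoted above for LSTM-NN-DPA can be cross-checked with a short script. The sketch below is not taken from [12] or [24]: it assumes a standard LSTM cell (four gates, each with a $P \times U$ and a $P \times P$ matrix-vector product, plus $3P$ element-wise products) and a fully connected NN with one hidden layer. Under these assumptions the component counts add up to the stated $12K_{on}^2 + 81K_{on} + 8K_{on}K_p$. The summation count is not reproduced here, since it depends on a counting convention that this section does not restate.

```python
def lstm_mults(u: int, p: int) -> int:
    """Assumed multiplications of a standard LSTM cell with u inputs and p hidden states."""
    return 4 * p * u + 4 * p * p + 3 * p


def nn_mults(n0: int, n1: int, n2: int) -> int:
    """Assumed multiplications of a fully connected NN with a single hidden layer."""
    return n0 * n1 + n1 * n2


def dpa_mults(k_on: int) -> int:
    """DPA cost stated in the text: 18 K_on multiplications/divisions."""
    return 18 * k_on


def lstm_nn_dpa_mults(k_on: int, k_p: int, n1: int = 15) -> int:
    """Total multiplications of LSTM-NN-DPA using the sizes given in the text."""
    u = 2 * (k_on + k_p)   # real and imaginary parts
    p = k_on
    n0 = n2 = 2 * k_on
    return lstm_mults(u, p) + nn_mults(n0, n1, n2) + dpa_mults(k_on)


def closed_form(k_on: int, k_p: int) -> int:
    """Closed-form expression quoted in the text for LSTM-NN-DPA."""
    return 12 * k_on**2 + 81 * k_on + 8 * k_on * k_p


# The component-wise count matches the closed form for several subcarrier settings.
for k_on, k_p in [(52, 4), (64, 8), (128, 16)]:
    assert lstm_nn_dpa_mults(k_on, k_p) == closed_form(k_on, k_p)

print(lstm_nn_dpa_mults(52, 4))
```

The same helper functions can be evaluated with other values of $U$ and $P$ to see how strongly the quadratic LSTM terms dominate the total.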
In addition, the LSTM unit of the LSTM-DPA-TA scheme has $U = 2K_{on} = 104$ inputs and $P = K_{on} = 52$ hidden states, while the TA technique requires $2K_{on}$ multiplications/divisions and $2K_{on}$ summations/subtractions. The proposed DPA-LSTM-NN scheme, in turn, consists of the DPA estimation followed by the LSTM unit with $P = (K_{on} + K_p)/2 = 28$ hidden states and $U = K_{on} + K_p = 56$ inputs, with an additional NN layer with $N_0 = N_2 = 2K_{on} = 104$ and $N_1 = 15$ neurons. The resulting complexity is given in (19), and Table 7 summarizes the real-valued operations required by the channel estimation schemes as a function of the number of active subcarriers. As we observe, the proposed DPA-LSTM-NN scheme has the smallest coefficients for the most significant terms in $K_{on}$ in the multiplications and divisions, which have the largest impact on the complexity of the considered estimators. This is relevant, e.g., for a different communication standard employing a different number of active and pilot subcarriers, for which our solution would still present a lower complexity than other LSTM-based solutions in the literature. In addition, Figure 16 illustrates the computational complexity of the schemes in the case of $K_{on} = 52$ subcarriers and $K_p = 4$ pilots. We observe that the proposed DPA-LSTM-NN estimator with subcarrier sampling requires at least 49.9% fewer real-valued operations than other LSTM-based solutions, and 16.7% fewer real-valued operations than the DPA-DNN scheme, while also improving the BER.

C. PRACTICAL ASPECTS
An important remark regarding the practical usage of DNN-based estimators is that their performance depends closely on the training stage of the network. In terms of the robustness of the training, a few observations arise from our investigation.
First, we observe that methods trained for higher modulation orders generalize to lower modulation orders. For instance, the QPSK constellation can be seen as a part of the 16-QAM constellation, so DNNs trained with 16-QAM work well when QPSK modulation is employed in the testing stage. In addition, DNNs trained for high velocity are able to achieve very good performance at lower velocities. For example, if a DNN trained for v = 200 km/h is used when v = 48 km/h, the results are very similar to those obtained when the DNN is trained with v = 48 km/h. However, the opposite is not valid and yields significant performance degradation.
Furthermore, it is quite useful for the DNN-based solution to be robust against changes in the channel model, opening opportunities for generalized learning architectures to estimate vehicular channels under different conditions. Throughout this paper, we considered the V2V-UC channel model, while other V2V channel models also exist [31]. One of the existing methods to generalize the solution is Ensemble Modeling (EM) [34], which is able to combine different neural networks, e.g., each trained for a different vehicle velocity or power delay profile, to improve prediction in a general case. A recent approach is presented in [35], where EM is used to combine individual LSTM models for a particular optimization problem. This process combines distinct models built for specific datasets in order to generate a generalized prediction that is robust to parameter variations, by combining the predictions of its components. Our choice for the EM solution here is motivated by the fact that no complexity is added to the operation of the DNN-based estimator. The EM technique only modifies the training stage of the DNN and thus has no impact on the complexity analysis performed in Section V-B.
FIGURE 18. Radar chart comparing the robustness to nonlinearities, SNR for optimized performance, modulation order for optimized performance, complexity, and effect of high mobility of the channel estimation schemes. We compare the proposed DPA-LSTM-NN scheme with the DPA-DNN [7], LSTM-NN-DPA [11] and LSTM-DPA-TA [12] schemes.
As an example, we have implemented EM in our scenario, where we train eight models with datasets generated for both the V2V-UC and V2V Same Direction With Wall (V2V-SDWW) [31] channel models, with different velocities $v \in \{48, 100, 150, 200\}$ km/h. Then, we assign equal weights to the models to obtain the EM through averaging. This method integrates the different offline trained models into a single DNN, which combines the learning of the different training datasets. Figure 17 plots the BER as a function of the SNR for v = 100 km/h, QPSK modulation and IBO = 2 dB. Figure 17a considers the V2V-UC channel, while V2V-SDWW is considered in Figure 17b. In addition, we compare in each figure the proposed DPA-LSTM-NN estimator trained specifically for a given channel model and velocity with its EM version, which integrates models trained for $v \in \{48, 100, 150, 200\}$ km/h and both channel models (denoted as DPA-LSTM-NN EM). As we observe, the DPA-LSTM-NN trained for one channel model and tested in a different model exhibits a performance loss. On the other hand, the EM-based solution works very well, exhibiting a performance very similar to the case in which DPA-LSTM-NN is trained and tested in the same channel model.
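The equal-weight combination described above can be sketched as follows. The text does not specify whether the weights are applied to the model parameters or to the model outputs; this sketch assumes parameter averaging, which is consistent with the statement that a single DNN results and that no complexity is added at inference. The dictionary layout, the parameter names and the placeholder values are hypothetical.

```python
def average_models(models, weights=None):
    """Combine models given as dicts {param_name: list of floats} by weighted averaging."""
    if not models:
        raise ValueError("at least one model is required")
    if weights is None:
        # Equal weights, as in the averaging approach described in the text.
        weights = [1.0 / len(models)] * len(models)
    if len(weights) != len(models):
        raise ValueError("one weight per model is required")
    combined = {}
    for name in models[0]:
        length = len(models[0][name])
        combined[name] = [
            sum(w * m[name][i] for w, m in zip(weights, models))
            for i in range(length)
        ]
    return combined


channels = ["V2V-UC", "V2V-SDWW"]
velocities = [48, 100, 150, 200]  # km/h

# Placeholder parameters standing in for eight trained models (2 channels x 4 velocities).
models = [
    {"lstm.w": [float(v), float(c)], "nn.w": [1.0]}
    for c, _ in enumerate(channels)
    for v in velocities
]

em_model = average_models(models)
print(len(models), em_model)
```

Because the combined model has the same shape as each individual model, the per-symbol operation count of the estimator is unchanged under this reading.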

D. SUMMARY OF THE ANALYSIS
Finally, the diagram in Figure 18 summarizes the main findings of the analysis and highlights the most appropriate application scenarios for each of the receivers compared in this work. Here, we emphasize the advantages of the proposed DPA-LSTM-NN as an estimator with low complexity compared to the benchmark estimators, capable of dealing with both the effects of mobility and the nonlinearities of the HPA, especially when it is possible to operate in the high SNR regime.

VI. CONCLUSION
In this work, we proposed a novel low-complexity LSTM-NN-based estimator for doubly-selective channels in the presence of HPA nonlinearities, considering the IEEE 802.11p standard for vehicular communications. The simulation results show that it is possible to improve the error compensation with respect to other LSTM-based estimators from the literature by using the DPA estimate as the input to the LSTM layer, showing that the DPA estimate provides an input that is better exploited by this nonlinear post-processing, especially in high SNR scenarios. Also, by sampling the subcarrier information used in the training and reducing the size of the LSTM layer, we show that it is possible to reduce the complexity of the DPA-LSTM-NN receiver, which requires at least 49.9% fewer real-valued operations than the recently proposed LSTM-NN-DPA and LSTM-DPA-TA schemes. We also explored an example of a generalized approach, which modifies the training stage of the DNN so that the final solution covers different channel models and vehicle velocities, providing robust and general learning architectures for vehicular communication scenarios. As future work, we aim to extend our studies by proposing alternatives to the pilot limitation imposed by the IEEE 802.11p standard, in order to increase the performance gain of the LSTM solutions, especially under high mobility. In addition, we also highlight the opportunity to extend the generalized DNN approach by applying other methods to design more robust and general learning architectures for these vehicular scenarios.