A Hybrid LSTM-ResNet Deep Neural Network for Noise Reduction and Classification of V-Band Receiver Signals

Noise reduction is one of the most important processes in signal processing for communication systems. The signal-to-noise ratio (SNR) is a key parameter to consider for minimizing the bit error rate (BER). The inherent noise found in millimeter-wave systems is mainly a combination of white noise and phase noise. Increasing the SNR in wireless data transfer systems can lead to reliability and performance improvements. To address this issue, we propose to use a recurrent neural network (RNN) with a long short-term memory (LSTM) autoencoder architecture to achieve signal noise reduction. This design is based on a composite LSTM autoencoder with a single encoder layer and two decoder layers. A V-band receiver test bench is designed and fabricated to provide a high-speed wireless communication system. Constellation diagrams display the output signals measured for various random sequences of PSK and QAM modulated signals. The LSTM autoencoder is trained in real time using various noisy signals. The trained system is then used to reduce noise levels in the tested signals. The SNR of the designed receiver is of the order of 11.8 dB, and it increases to 13.66 dB using the three-level LSTM autoencoder. Consequently, the proposed algorithm reduces the bit error rate from 10^-8 to 10^-11. The performance of the proposed algorithm is comparable to other noise reduction strategies. Augmented denoised signals are fed into a ResNet-152 deep convolutional network to perform the final classification. The demodulation types are classified with an accuracy of 99.93%. This is confirmed by experimental measurements.


I. INTRODUCTION
Amplitude and phase noise reduction is important in all fields of signal processing, including RF and microwave communications, and data analysis. Phase noise comes from a multiplicative process widely used for generating millimeter-wave signals. White noise is proportional to the signal bandwidth and to the noise temperature. Both are important in millimeter-wave systems. They can make the extraction of the desired information from a signal more difficult.
This degrades remote sensing and data transfer in wireless communication systems. Most noise reduction algorithms applied to RF signals are based on a time-frequency representation of the input, and on digital denoising techniques, such as the short-time Fourier transform (STFT), the singular value decomposition (SVD), and the fast wavelet transform (FWT) [1].
The associate editor coordinating the review of this manuscript and approving it for publication was Qingli Li.
A recent and effective approach is based on the wavelet technique. For example, Yu et al. [2] proposed to use complex wavelets for audio signal processing, to protect the phase of the signal. They developed two new denoising methods, a sophisticated thresholding process, and biased risk thresholding [3], [4]. Using the proposed procedure to find the threshold can however be challenging.
Many authors propose to use a filter for removing noise and for restoring the spatial resolution of a signal. For example, the low-pass filter excludes noise at a low level. The moving average filter takes an average of the signal. The finite impulse response (FIR) filter removes high-frequency components from the baseband signal [5]. These filters sometimes remove edge information in the denoising process. The signal-to-noise ratio (SNR) can then be improved by increasing the power in the carrier signal, but this is difficult to achieve at millimeter-wave frequencies.
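The loss of edge information mentioned above can be illustrated with a minimal moving average sketch. The function name `moving_average`, the window length, and the step-shaped test signal are illustrative choices, not taken from the cited works.

```python
def moving_average(signal, window):
    """Return the running mean of `signal` over `window` consecutive samples.

    Only fully covered windows are kept, so the output has
    len(signal) - window + 1 samples.
    """
    if window < 1 or window > len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    return [
        sum(signal[start:start + window]) / window
        for start in range(len(signal) - window + 1)
    ]


# Hypothetical test signal: an ideal step, such as a symbol transition.
step = [0.0] * 5 + [1.0] * 5
smoothed = moving_average(step, window=3)
# The one-sample transition of the input is spread over several output samples.
```

A longer window averages out more noise but smears each transition over more samples, which is the trade-off the paragraph refers to.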
The proposed approach is based on deep learning techniques, which have gained popularity in recent years in the telecommunications industry, mainly to cancel noise distortion in receiver signals. They are powerful on-time methods that can be applied to both the phase noise and the amplitude noise. Autoencoders based on perceptrons or recurrent neural networks (RNNs) have been successfully used to extract the features of clean data from noisy signals in various applications. Many deep learning-based algorithms were proposed for image processing. Among those, the long short-term memory (LSTM) network has been successfully used to perform sequential learning. A deep LSTM denoising autoencoder network was used to enhance hybrid speech [6]. The denoising of transient electromagnetic data was also performed using an LSTM network autoencoder [7]. An LSTM convolutional neural network (CNN) was proposed to improve voice activity detection [8]. Table 1 lists related studies that use similar deep learning algorithms for various applications. They mainly consist of deep learning techniques used for image classification or parameter estimation. The use of deep learning algorithms for telecommunications systems has still not been well exploited. One such study [9] proposes a CNN algorithm for modulation classification.
In this work, an LSTM-ResNet network was used for the first time for denoising and classifying receiver signals. The experimental results were obtained from a V-band six-port based receiver designed for millimeter-wave wireless communications. This six-port technology, which was developed for direct-conversion radio receivers, has been studied for many applications [10], [11]. It can improve the performance of a receiver, especially in terms of bit error rate (BER) [12].
The proposed methodology can be summarized as follows:
• An LSTM network autoencoder algorithm is used to remove noise from demodulated PSK and QAM signals. The proposed deep learning model increases the signal-to-noise ratio by about 2 dB (from 11.8 dB to 13.66 dB), and consequently lowers the bit error rate to the order of 10^-11.
• Denoised signals are fed into a deep residual network (ResNet). Three different ResNet models (ResNet-50, ResNet-101, and ResNet-152) are implemented to perform the demodulation classification. Accuracy, precision, recall, and F1 scores of the various ResNet models are compared.
• The rotation augmentation technique is applied to artificially expand the size of the training dataset. Experimental results show that the combination of the augmentation technique, ResNet-152, and an LSTM network achieves an accuracy of 99.93%.
The paper is organized as follows: the experimental setup of the proposed millimeter-wave transceiver system is described in Section II. A description of the LSTM network used to denoise voltage signals, of the CNN used to classify the denoised in-phase and quadrature (I/Q) signals, and experimental results are given in Section III. Section IV concludes the paper.

II. EXPERIMENTAL SETUP
Figure 1 illustrates the block diagram of the transmitter and receiver for the proposed high-speed wireless communication system. The transmitter includes an HP 8360 series synthesized sweeper (HP8360), an Agilent E4438C ESG vector signal generator, and a 20-dBi millimeter-wave SAC-2012-15-S2 conical horn antenna (ANT). The HP8360 C-series sweeper is used to generate a local oscillator (LO) signal with a frequency of 30 GHz and a power of 10 dBm. The baseband quadrature I/Q outputs of the Agilent E4438C ESG vector signal generator are applied to the sub-harmonic mixer inputs to generate high-symbol-rate random sequences of PSK and QAM symbols. The generated signals are fed to the conical horn antenna (ANT), which transmits the modulated signals to the receiver antenna.
The receiver parts include a similar 20-dBi conical horn antenna, a six-port down-converter, and an SBL-6039032550-1212-E1 low noise amplifier (LNA) with 25 dB gain and a noise figure (NF) of 5 dB.
The six-port circuit prototype and the related power detectors are fabricated on a thin ceramic substrate using a miniaturized hybrid microwave integrated circuit (MHMIC) technology [20]. The reference signal on the receiver side is generated using an Anritsu 68347C synthesized signal generator, a home-made frequency multiplier (FM) with an HMC578 GaAs active multiplier (×2), and an HMC1105 GaAs passive multiplier (×2). An attenuator operating over a 40 dB range, and a phase shifter, are also used to control the power level and the phase of the LO signal in the millimeter-wave band. The attenuator is adjusted to have a power of −25 dBm at reference port 5. The six-port circuit has three hybrid couplers with a 90 degree phase shift and a Wilkinson power divider [12], [21]. The six-port technology is used in the receiver part for various reasons: it provides a straightforward direct demodulation of quadrature demodulation schemes (M-QAM, PSK), a low-cost demodulator containing only passive circuits and four diodes, and good dynamic range. Only low power is required for the LO in the down conversion, which is crucial for mm-waves. These characteristics result in a low-cost interferometric receiver with good efficiency [22], [23]. The propagation path loss is large due to the operating frequency. The distance between the transmitter and receiver antennas is therefore set to one meter (the free space attenuation is 68 dB) so that the signals are detectable. Two lenses are also added in the setup to compensate for the low power of the transmitter. A dielectric lens with a gain of around 6 dB is on the left-hand side, and a planar meta-material lens [24] with a gain of about 10 dB is on the right-hand side. A six-port interferometer and a Schottky diode-based detector ensure well-defined relations between both the input and all four output signals [25], [26].
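The 68 dB free-space attenuation quoted above for a one-meter link can be checked with the standard Friis path-loss formula. The sketch below assumes a 60 GHz carrier, which is inferred here from the 30 GHz LO driving a sub-harmonic mixer and is not stated explicitly in the text; the function name is illustrative.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def free_space_path_loss_db(distance_m, frequency_hz):
    """Free-space path loss in dB between isotropic antennas (Friis formula)."""
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT_M_PER_S
    )


# Assumed carrier frequency: 60 GHz (inferred, see the lead-in above).
loss_db = free_space_path_loss_db(distance_m=1.0, frequency_hz=60e9)
print(f"Path loss at 1 m: {loss_db:.1f} dB")
```

Because the loss grows by 6 dB for every doubling of the distance, the antenna and lens gains listed above are what keep the received signal detectable at this frequency.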
The output voltages are given by (1), where the constant K is measured in V/W. The four baseband voltages v_1 to v_4 from the millimeter-wave front-end receiver are connected to four identical video amplifiers. An Agilent Infiniium DSO80804B digital oscilloscope is then used to display and record the received demodulated signals. The I/Q signals can be computed by subtracting two baseband voltages and by reducing the DC offset value [25], as expressed in (2). Figure 3 illustrates demodulation results of a pseudo-random PSK and QAM bit sequence of 100 nanoseconds. Given the limitations of the oscilloscope, we saved 1 MS/s (megasamples per second), and each symbol has a duration of 1 µs. We therefore have a packet size of 10^5 symbols over a duration of 0.1 second. We can see that the demodulated signal shapes follow the input of the modulated signals generated by the transmitter. We observe that the odd and even index voltages are out of phase. The I/Q signal generation by a differential approach therefore increases their levels and decreases the inherent DC offset of the demodulator. The measured BER on the test bench, using a pseudo-random QPSK sequence, is less than 10^-8, which corresponds to an SNR of about 11.8 dB. In the next section, a description of the LSTM network to denoise voltage signals, and of the CNN to classify the denoised in-phase and quadrature (I/Q) signals, is given in detail.
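The differential I/Q generation described above can be sketched as follows. The pairing of the detector outputs (I from v_3 and v_1, Q from v_4 and v_2) and the mean subtraction used to remove the residual DC offset are assumptions made for illustration; the exact expressions are those of the paper's own equations. The synthetic voltages are hypothetical.

```python
import numpy as np


def iq_from_detector_voltages(v1, v2, v3, v4):
    """Form I and Q by subtracting paired detector outputs.

    Assumed pairing: I = v3 - v1, Q = v4 - v2. Any residual DC level is
    removed by subtracting the mean of each difference signal.
    """
    i = np.asarray(v3, dtype=float) - np.asarray(v1, dtype=float)
    q = np.asarray(v4, dtype=float) - np.asarray(v2, dtype=float)
    return i - i.mean(), q - q.mean()


# Hypothetical detector outputs: a common DC level plus out-of-phase signal terms.
phases = np.deg2rad(np.tile([45.0, 135.0, 225.0, 315.0], 25))  # QPSK symbols
dc_level, amplitude = 0.5, 0.1
v1 = dc_level - amplitude * np.cos(phases)
v3 = dc_level + amplitude * np.cos(phases)
v2 = dc_level - amplitude * np.sin(phases)
v4 = dc_level + amplitude * np.sin(phases)

i_signal, q_signal = iq_from_detector_voltages(v1, v2, v3, v4)
```

In this toy case the common DC level cancels in the subtraction and the signal term doubles, which matches the qualitative behavior reported for the measured voltages.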

A. RECURRENT NEURAL NETWORKS WITH LONG SHORT-TERM MEMORY
Deep learning methods have been widely used to extract valuable information in complex systems. Various architectures are available, depending on the characteristics of the input data. Recurrent neural networks (RNNs) can deal with the time-dependent nature of the input data. They have been used in areas with sequential data, such as text, audio, and video processing [27], [28]. Given the cyclic connections in the RNN architecture, current state updates depend on past states and on current input data. However, RNNs cannot deal with long time lags between relevant signals. Long short-term memory (LSTM) networks address this limitation by introducing gate functions into the cell structure [29]. LSTM networks are deep learning models that can be divided into two broad categories: LSTM-dominated networks and integrated LSTM networks. A basic LSTM unit is made of a single hidden layer, with an average pooling layer and a logistic regression output layer. The hidden layer consists of standard recurrent cells, such as sigma cells or tanh cells. Signals are denoised using multilayer LSTM networks. The architecture of a multilayer LSTM network is illustrated in Figure 4. In this recurrent structure, useful information is stored to minimize the loss function in each neuron, and the noise in the signals is discarded by the forget gate of the memory units. LSTM unit processing is based on equation (3), which gives the update of the internal state c_t and the output vector h_t. This dependency on the previous data makes this method suitable for denoising purposes. Algorithm 1 describes the LSTM algorithm. It goes through multiple epochs until either the maximum number of iterations is reached or the cost function target is met.
VOLUME 10, 2022
Algorithm 1 (partial listing):
for i = 1 to n do
    Calculate i_t, f_t, and c_t (eq. 3)
    Update cell state c_t (eq. 3)
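For reference, the textbook form of the LSTM cell update is given below. This is the standard formulation and not necessarily the exact notation of the paper's equation (3); the weight matrices W and U, the biases b, the output gate o_t, and the candidate state \tilde{c}_t are assumed symbols.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(internal state update)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(output vector)}
\end{aligned}
```

Here \sigma is the logistic sigmoid and \odot is the element-wise product. The forget gate f_t scales the previous state c_{t-1}, which is the mechanism through which past content, including noise, can be discarded.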
There is no pure signal in the real world. One of the main challenges of signal processing is canceling unwanted noise. There are many mathematical strategies to remove noise from a signal. However, these methods are used when the range of the signal noise is constant. The exact shape of the noise cannot be estimated, or it is unknown, in many cases. The LSTM network autoencoder is used for these purposes. A deep LSTM network consists of a return sequence, a repeat vector, and time distributed functions.
LSTM autoencoder procedure (partial listing):
    Decoded 2 (repeat, LSTM, time distributed)
end function
Set Parameters (test/train data, encoder and decoder LSTM units, and optimizer and cost parameters)
Normalize the dataset (values from 0 to 1)
for epochs and batch size do
    Predict results using LSTM model
    Cost (cross entropy mean)
    Optimizer (ADAM, learning rate 0.001)
end for
Figure 5 shows the data flow and the architecture of the LSTM network autoencoder with one encoder layer and two decoder layers, one for the reconstruction and one for the prediction. The input to the model is a sequence of 500 vectors with 10,000 samples. In what follows, Layer 1 reads the input data, and it outputs 500 features. The output of this layer is considered as an encoded feature vector. In the reconstruction layer of the decoder, the repeat vector replicates the encoded feature vector 500 times. Then, the next layer is designed to unfold the encoded feature vector. Hence, the encoder is used in the reverse order in the decoder layer. The final layer, Time Distributed (Dense(1)), is added at the end to give the output. In the prediction layer of the decoder, the repeat vector is set to 100 features. Figure 6 shows the denoised demodulation results of pseudo-random PSK and QAM using the LSTM network autoencoder. The SNR of the proposed receiver is around 11.8 dB and it increases to 13.66 dB using the three-level LSTM method.
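The data preparation and the repeat-vector step described in this section can be sketched with NumPy. This is a shape-level illustration only, not the trained network: the reading of the input as a 10,000-sample record cut into windows of 500 time steps is an assumption, the encoder output is replaced by random numbers, and all function names are hypothetical.

```python
import numpy as np


def minmax_normalize(x):
    """Scale a 1-D signal to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0.0:
        return np.zeros_like(x)
    return (x - x.min()) / span


def frame_signal(x, timesteps=500):
    """Cut a 1-D signal into non-overlapping windows of `timesteps` samples.

    Returns an array of shape (n_windows, timesteps, 1); samples that do not
    fill a complete window are dropped.
    """
    n_windows = len(x) // timesteps
    return x[: n_windows * timesteps].reshape(n_windows, timesteps, 1)


def repeat_vector(encoded, n):
    """Copy each encoded vector n times along a new time axis."""
    return np.repeat(encoded[:, np.newaxis, :], n, axis=1)


rng = np.random.default_rng(0)
# Hypothetical noisy record of 10,000 samples.
noisy = np.sin(np.linspace(0.0, 20.0 * np.pi, 10_000)) + 0.1 * rng.standard_normal(10_000)

frames = frame_signal(minmax_normalize(noisy), timesteps=500)
encoded = rng.standard_normal((frames.shape[0], 500))  # stand-in for the encoder output
decoder_input = repeat_vector(encoded, 500)            # what the reconstruction decoder receives
```

Normalizing to [0, 1] keeps the targets compatible with a cross-entropy cost, and the repeat step is what turns the single encoded vector back into a sequence that the decoder LSTM can unfold.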
The SNR gain from 11.8 dB to 13.66 dB reduces the bit error rate from 10^-8 to 10^-11. In Table 3, the SNR and BER of three different denoising techniques are compared. The wavelet (wdenoise function with the Symlet family of order 4) and low-pass Savitzky-Golay filter (sgolayfilt function) denoising methods are applied using the Matlab toolbox. As shown in Table 3, the wavelet denoising technique gives results comparable to the LSTM network autoencoder method. However, the proposed method is a very simple technique that does not require defining a threshold, and it preserves the important information in the demodulated signals. In the next section, a convolutional neural network algorithm will be used to classify various demodulation constellation signals using the denoised data outputs of the LSTM networks.
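The link between the SNR gain and the BER reduction can be illustrated with the theoretical error rate of coherent QPSK. The sketch assumes an additive white Gaussian noise channel, Gray-coded QPSK, and that the quoted SNR can be read as the energy-per-bit to noise-density ratio; none of these assumptions is stated in the text, and the function name is illustrative.

```python
import math


def qpsk_ber(snr_db):
    """Theoretical BER of Gray-coded coherent QPSK in AWGN.

    Assumes `snr_db` is Eb/N0 in dB: BER = 0.5 * erfc(sqrt(Eb/N0)).
    """
    snr_linear = 10.0 ** (snr_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(snr_linear))


ber_before = qpsk_ber(11.8)    # receiver output before denoising
ber_after = qpsk_ber(13.66)    # after the LSTM autoencoder
print(f"BER before: {ber_before:.2e}, BER after: {ber_after:.2e}")
```

Under these assumptions, an SNR gain of less than 2 dB lowers the error rate by several orders of magnitude, because the complementary error function falls off very steeply in this range.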

B. ResNet CONVOLUTIONAL NETWORKS
Deep convolutional neural networks brought notable improvements to image classification techniques. Deep networks can be enriched by adding a number of stack layers, and by integrating low, mid, and high-level features and classifiers [30], [31]. The number of layers (the depth of the neural network) is known to influence the accuracy of the classification [32], [33]. But increasing the depth of the network can also cause saturation. Accuracy may then degrade rapidly. Increasing the number of layers may lead to overfitting and increased training errors [34].
Deep networks often suffer from vanishing gradients. As the model backpropagates, gradients get smaller, which can make learning difficult. To address this problem, the deep residual learning framework has been used in this research to help train deeper networks. The main innovation of ResNet is the skip connection, labeled "identity connection" in Figure 7. Skip connections are also known as identity shortcut connections. They allow the network to learn the identity function, so that the input can pass through a block without going through its weight layers. The desired mapping is denoted H(x). The stacked layers fit the residual mapping F(x) = H(x) − x, and the original mapping is recovered as F(x) + x. It is assumed that the optimal performance can be achieved when the blocks are closer to the identity mapping than to the zero mapping, and it should also be easier to find perturbations with reference to an identity mapping. This makes it possible to stack extra layers and create a deeper network, to mitigate the vanishing gradient, and to let the network bypass layers that contribute little to training.
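The skip connection can be illustrated with a deliberately simplified residual block. The sketch uses two dense layers instead of the convolution and batch-normalization layers of an actual ResNet, and it omits the activation that normally follows the addition; the weights and the input vector are hypothetical.

```python
import numpy as np


def relu(x):
    """Rectified linear unit."""
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """Return F(x) + x, where F is a small two-layer residual branch."""
    residual = w2 @ relu(w1 @ x)  # F(x): what the weight layers learn
    return residual + x           # identity shortcut adds the input back


x = np.array([1.0, -2.0, 3.0])

# With all-zero weights the residual branch vanishes and the block is the identity.
zero_weights = np.zeros((3, 3))
identity_output = residual_block(x, zero_weights, zero_weights)

# With non-zero weights the block outputs the input plus a learned perturbation.
rng = np.random.default_rng(1)
w1, w2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
perturbed_output = residual_block(x, w1, w2)
```

Since a block only has to drive its weights toward zero to behave as an identity, adding blocks should not, in principle, make the network harder to fit.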
In this work, the input data includes 1200 signals belonging to 6 different classes related to demodulated I/Q signals. The 6 classes are defined as: 1) BPSK; 2) QPSK; 3) 8 PSK; 4) 16 QAM; 5) 16 PSK; 6) 32 QAM. The ResNet is applied to the I/Q signals that are collected using the wireless sensor, and they are denoised using LSTM network autoencoders. A data augmentation technique is applied to artificially expand the size of the training dataset. Augmentation techniques are powerful methods to reduce training errors due to overfitting. In this work, the rotation augmentation technique is performed by rotating the image one degree clockwise, and one degree counter-clockwise with respect to the y-axis.
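The effect of a small rotation on a constellation can be sketched by rotating the I/Q points directly. This is a stand-in for the image rotation used in this work: the paper rotates constellation images, whereas the sketch below rotates complex points about the origin. The function names and the ideal QPSK points are illustrative.

```python
import cmath
import math


def rotate_constellation(points, angle_deg):
    """Rotate complex I/Q points about the origin (positive = counter-clockwise)."""
    rotation = cmath.exp(1j * math.radians(angle_deg))
    return [p * rotation for p in points]


def augment_by_rotation(points, angle_deg=1.0):
    """Return the original set plus one clockwise and one counter-clockwise copy."""
    return [
        list(points),
        rotate_constellation(points, -angle_deg),
        rotate_constellation(points, +angle_deg),
    ]


# Hypothetical ideal QPSK constellation with unit symbol energy.
qpsk = [complex(i, q) / math.sqrt(2.0) for i in (-1, 1) for q in (-1, 1)]
augmented = augment_by_rotation(qpsk, angle_deg=1.0)
```

A one-degree rotation leaves the amplitude of every point unchanged and only shifts its phase slightly, so the class label of the rotated sample stays valid.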
80% of the data is used for training, and 20% is used for testing. To evaluate the effect of the number of layers, three ResNet variants (ResNet-50, ResNet-101, and ResNet-152, where the number gives the number of layers) were implemented. Accuracy, precision, recall, and F1 scores of the various ResNet models are given in Table 3. The computing times for each epoch are also listed in the last column of Table 3. The results show that ResNet-152 offers the best performance in comparison with ResNet-50 and ResNet-101.
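The 80/20 split can be sketched as a per-class (stratified) shuffle. The sketch assumes the 1200 signals are evenly distributed over the 6 classes (200 each) and that the split is stratified; neither point is stated in the text, and the function name and seed are illustrative.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into train and test sets, class by class."""
    rng = random.Random(seed)
    indices_by_class = {}
    for index, label in enumerate(labels):
        indices_by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in indices_by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


CLASSES = ["BPSK", "QPSK", "8PSK", "16QAM", "16PSK", "32QAM"]
labels = [name for name in CLASSES for _ in range(200)]  # assumed balanced dataset
train_idx, test_idx = stratified_split(labels)
```

Splitting class by class keeps the same proportion of each modulation in both sets, so that per-class scores on the test set are computed on comparable sample counts.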
Curves of accuracy and loss, with respect to the number of epochs, are shown in Figure 8. We see that combining ResNet-152 and the LSTM network leads to a classification accuracy of 99.93%. The performance of the network is maintained while the number of layers is increased in the ResNet. This may be attributed to the identity mappings in the network: some layers can be bypassed without affecting the network performance, which avoids overfitting effects as the depth of the network increases. Assessing the performance of the network, based on the input data size and on the processing time of the deep learning methods, is always a challenge. For example, in the current research, combining the CNN and the LSTM network results in a classification accuracy of 98.6%, with a processing time of 32 s/epoch for the 1200 signals in the input dataset. There would be 6000 signals if we used the augmentation technique and rotated the image by five degrees to artificially expand the size of the training dataset. The ResNet method is therefore used to avoid the drawbacks of overfitting for a given input dataset size, which leads to higher classification accuracy at the cost of more processing time.
To show the performance of the ResNet-152 algorithm, the confusion matrix, or error matrix, is shown in Figure 9. This matrix reports the number of false positives, false negatives, true positives, and true negatives for all 6 demodulation classes. We find that demodulated signals with fewer constellation points can be classified with high accuracy. For a small number of constellation points, there are no false positives or false negatives. For a high number of constellation points, the number of false positives and false negatives is negligible.
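The scores reported for the classifiers follow directly from the entries of a confusion matrix. The sketch below uses the usual definitions of precision, recall, and F1; the three-class matrix is a made-up example, not the data of Figure 9, and the row and column convention (rows are true classes, columns are predicted classes) is an assumption.

```python
def per_class_metrics(confusion):
    """Return a (precision, recall, F1) tuple for each class.

    `confusion[r][c]` counts samples of true class r predicted as class c.
    """
    n_classes = len(confusion)
    metrics = []
    for k in range(n_classes):
        true_pos = confusion[k][k]
        false_pos = sum(confusion[r][k] for r in range(n_classes)) - true_pos
        false_neg = sum(confusion[k]) - true_pos
        precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
        recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
        f1 = (2.0 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append((precision, recall, f1))
    return metrics


def accuracy(confusion):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / sum(sum(row) for row in confusion)


# Hypothetical 3-class example: class 0 is never confused, classes 1 and 2 sometimes are.
toy_confusion = [
    [5, 0, 0],
    [0, 4, 1],
    [0, 1, 4],
]
toy_metrics = per_class_metrics(toy_confusion)
toy_accuracy = accuracy(toy_confusion)
```

A class with no off-diagonal entries in its row and column, such as class 0 here, has no false positives or false negatives and therefore reaches a precision, recall, and F1 score of 1.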

IV. CONCLUSION
This paper proposes a deep learning technique for noise reduction and signal classification in RF microwave communication systems. A V-band receiver based on a six-port technology is designed and fabricated using the miniaturized hybrid microwave integrated circuit (MHMIC) technology. Output voltages and constellation diagrams are measured for various random sequences of PSK and QAM modulated signals. A recurrent neural network (RNN) with a long short-term memory (LSTM) network autoencoder architecture is proposed to achieve signal noise reduction. The LSTM network autoencoder is trained with various noisy signals in real time, and the trained system is used to reduce noise in the tested signals. The signal-to-noise ratio (SNR) of the proposed receiver is around 11.8 dB, and it increases to 13.66 dB using the LSTM network. Consequently, the proposed algorithm reduces the bit error rate from 10^-8 to 10^-11. Denoised signals were fed into ResNet convolutional networks to perform the final classification. ResNet-152 classified constellation diagrams with an accuracy of 99.93%.