Timing Synchronization Based on Supervised Learning of Spectrogram for OFDM Systems

This paper proposes a supervised convolutional neural network (CNN) based symbol timing synchronization method using the spectrogram image for preamble-less orthogonal frequency division multiplexing (OFDM) systems. With the development of mobile terminals, OFDM has become an increasingly widespread fundamental technology for wireless communications. While OFDM can achieve high-speed transmission, its decoding is sensitive to synchronization timing. Thus, an accurate synchronization timing estimation method is essential for reliable communication. Conventional preamble-less synchronization timing estimation methods have not been sufficiently investigated in terms of estimation accuracy under varying environments, comprehensive performance evaluation, and robustness to Doppler shift. Focusing on the spectrum fluctuations observed when synchronization errors occur, the proposed approach trains the CNN on spectrogram images to find accurate synchronization points even in noisy and fluctuating environments. Simulation results show that the proposed method achieves better synchronization accuracy than existing methods. Furthermore, it exhibits near-optimal bit error rate (BER) characteristics and a superior processing time relative to BER in environments close to realistic settings, such as broader synchronization timing and various Doppler shifts.

Shun Kojima, Member, IEEE, Yuta Goto, Kazuki Maruta, Senior Member, IEEE, Shinya Sugiura, Senior Member, IEEE, and Chang Jun Ahn, Senior Member, IEEE

Index Terms: Symbol timing synchronization, convolutional neural network, spectrogram, preamble-less OFDM system.

I. INTRODUCTION
THE DEMAND for wireless communications continues to increase with the widespread use of smartphones and tablets, as well as the growing demand for 8K and live streaming. Next-generation wireless communications technology, such as beyond 5G or 6G, needs to satisfy high capacity, massive connectivity, ultra-reliability, and low latency. Orthogonal frequency division multiplexing (OFDM) is still regarded as the major technology for achieving these requirements. OFDM is a parallel multi-carrier transmission method that has been applied in many fields, such as digital terrestrial television broadcasting (DTTB), wireless local area network (LAN), ultra-wide band (UWB), and third-generation partnership project (3GPP) systems, and hence it is one of the most important technologies in wireless communications [1], [2]. A key requirement of OFDM systems is accurate symbol synchronization, which ensures orthogonality among subcarriers after the fast Fourier transform (FFT) operation. A synchronization error breaks this orthogonality, which causes inter-symbol interference (ISI) as well as inter-carrier interference (ICI).
To tackle these problems, preambles, which are sequences known at both the transmitter and the receiver, are essential for symbol timing synchronization [3], [4], [5], [6]. Their fundamentals are well established and widely implemented in various kinds of OFDM-based wireless communication standards. The authors in [3] proposed a synchronization method that applies the expectation-maximization algorithm to an OFDM preamble to reduce the computational complexity. Paper [4] presents a preamble-aided timing estimation method based on restricted cross-correlation and timing adjustment. The work in [5] employed a chirp signal, which shows great potential under a multipath fading channel environment. The authors in [6] introduced a pair of optimal m-sequences inserted into the frequency domain to convey signaling and facilitate synchronization.
Although preamble-based synchronization is the mainstream of current implementations, the preamble does not contribute to information transfer, and its overhead reduces efficiency. Wireless communication has reached the point where even the slightest overhead must be reduced to truly realize enhanced capacity, ultra-low latency, and ultra-low power consumption. Preamble-less synchronization methods have therefore been studied as a solution to this drawback. The most common approach utilizes the auto-correlation with the cyclic prefix (CP), which is a copy of the last part of the OFDM symbol. However, degraded accuracy of synchronization timing estimation remains a major challenge for these methods because of their weak tolerance to additive noise.
To realize precise estimation of the synchronization timing, we previously proposed applying a convolutional neural network (CNN) to one-dimensional power spectrum images as a preamble-less estimation method [7]. If the FFT windowing is incorrect, the loss of orthogonality can be observed at the guard band in the spectrum image as signal power leakage due to ISI. Pre-training on this feature makes it possible to derive an appropriate FFT windowing point without auto-correlation processing. Although this method has shown good results, especially when the signal-to-noise ratio (SNR) is high, many issues remain, including low overall estimation accuracy and the fact that only ideal synchronization scenarios were considered. In addition, since the spectrum is a one-dimensional image, the feature extraction capability of the CNN is not fully utilized, resulting in low accuracy for the amount of computation. Therefore, to further improve the OFDM timing synchronization accuracy and to consider more practical environments, this paper proposes an enhanced approach that feeds the spectrogram image of the received signal to the CNN. A spectrogram represents the time transition of the power spectrum, and it can contain more informative features for training. The proposed method exploits this property and applies a CNN, which has strong feature extraction capability for images, to enhance the estimation accuracy. Simulation results reveal the effectiveness of the proposed method in more realistic environments, such as severe timing offsets and higher Doppler frequencies.

A. Related Work
Preamble-less synchronization has been widely studied in order to address the problems of preamble-based methods and to improve the processing speed [8], [9], [10], [11], [12], [13]. The synchronization methods developed in [8] and [9] calculate the correlation value between two sliding blocks of the same length as the CP. Still, they are susceptible to multipath fading and are not stable. In [10], the authors proposed an extension of the Gini-Giannakis estimator for single-carrier systems, which relies on second-order statistics only and exploits the cyclostationarity of the OFDM signal. This method exhibits a small sensitivity to stationary noise even without preambles. The work in [11] proposed minimum mean-squared error (MMSE) estimators exploiting the conjugate-symmetry property exhibited by the OFDM signal with real data symbols. It significantly outperforms CP timing estimators in additive white Gaussian noise (AWGN) channels. Moreover, the effectiveness of a modified symbol-timing estimator, which can ensure satisfactory performance in multipath fading channels, is clarified.
In [12], the authors presented an estimated weighting factor and two symbol timing estimators for CP-OFDM systems without prior knowledge of signal and noise powers. Paper [13] introduced a blind symbol timing estimation method with a constant modulus constellation. It can accurately estimate the symbol timing with only one OFDM symbol, so it has a higher tolerance to time-varying channels. Although these preceding approaches require no preambles, there is still room for improvement in their accuracy, computational complexity, and tolerance to additive noise as well as fast-moving environments. Modified methods that can synchronize even in high-speed and multipath environments have been studied [14], [15], but these approaches do not provide sufficient synchronization accuracy, especially in lower SNR regions.
In recent years, machine learning technology has been applied in a variety of fields. Its application to wireless communications has also been particularly active and has yielded remarkable achievements [16], [17], [18], [19], [20], [21]. Major achievements include modulation recognition using machine learning [16], [17], channel estimation by deep learning [18], [19], and communication environment estimation for adaptive modulation and coding [20], [21]. Timing synchronization methods using machine learning are also being investigated [22], [23], [24], [25], [26]. In [22], the authors proposed a preamble-based time of arrival and carrier frequency offset estimation method using a deep neural network (DNN) in a narrowband IoT (NB-IoT) multiuser environment. The authors of [23] presented symbol timing offset estimation using the CNN-DNN model (CDM). The work in [24] proposed a DNN-based receiver that outperforms the existing approaches with or without timing synchronization error. By applying the DNN structure, the method realizes high robustness against timing synchronization error and ISI. In [25], the authors presented a one-dimensional CNN-based approach for packet detection and synchronization. Introducing deep learning with a simple structure achieves low computational complexity while maintaining high detection performance. The authors in [26] addressed a blind synchronization problem under the variable frame length scenario. To solve the problem, they proposed deep learning-aided blind synchronization timing estimation. Highly accurate estimation is attained by applying a CNN with regression as its output. While these state-of-the-art studies show impressive performance, they suffer from problems such as limited estimation accuracy and the lack of comprehensive communication performance evaluation.

B. Contribution of This Paper
This paper proposes a synchronization timing estimation method that applies a CNN to the spectrogram image of the received signal. The main contributions of this paper are:
1) To achieve higher synchronization accuracy by employing a CNN with supervised learning of spectrogram images. Spectrograms contain a complex mixture of multiple pieces of information necessary for accurate synchronization timing detection: power distortion, leakage, and frequency variations. The CNN learns such phenomena to detect the exact synchronization timing, even without taking the auto-correlation via the CP.
2) To analyze the effect of timing offsets appearing in the spectrogram of the received signal and to evaluate the proposed synchronization method in a realistic environment. In [7], the evaluation was done only when the Doppler frequency was low and the synchronization timing was ideal. Since such evaluations are insufficient for real-world environments, this paper solidifies the effectiveness of the proposed method by evaluating cases with higher Doppler frequencies and with broader synchronization timing.
3) To clarify the viability of the proposed method from the viewpoint of computational complexity. In this paper, the effectiveness of the proposed method is demonstrated through a comparative evaluation with several conventional synchronization algorithms in terms of processing time and bit error rate (BER).
To the best of our knowledge, no methods have been proposed to improve the accuracy of synchronization timing estimation by using spectrograms as input for machine learning. For the first time, this paper presents a theoretical analysis of a method for estimating synchronization timing using a CNN by converting received signals into spectrogram images and shows that the proposed method improves the practical BER performance by determining the accurate synchronization timing.
The rest of this paper is organized as follows. Section II describes the system model. Section III reviews representative preamble-less timing synchronization methods. Section IV presents the proposed method using a CNN trained on spectrogram images. Section V shows its simulation results compared to conventional methods. Finally, Section VI provides concluding remarks.

II. SYSTEM MODEL

A. Channel Model
This paper assumes OFDM-based transmission and reception over a time-varying multipath fading channel [21], [27], [28]. Fig. 1 shows the block diagram, including the proposed method. The system model uses a pseudo-random binary sequence generator for data generation and assumes fixed CP removal. In wideband transmission, multipath fading occurs when the differences among the propagation path lengths are sufficiently long compared to the wavelength. It causes a drop in the received signal level at specific spectral components, which is called frequency-selective fading. The channel impulse response h(τ, t) is expressed as a superposition of delayed paths, where h_l and L indicate the complex channel gain of the l-th path and the number of discrete paths. δ(·) and τ_l denote the Dirac delta function and the delay of the individual paths, respectively. T_ms represents the multipath spread, ⌊·⌋ denotes the floor function, and W is the transmission signal bandwidth, which is determined by N and f_sub, the number of subcarriers and the subcarrier spacing, respectively. Each channel tap is represented in terms of the following quantities. Here, L represents the number of arriving rays. a_l and f_d denote the l-th path amplitude and the maximum Doppler frequency. f_c, η, and ι indicate the carrier frequency, user terminal speed, and the speed of light, respectively. θ_l represents the angle of arrival of the l-th incoming wave, and φ_l is its initial phase. An ensemble-average operation is also used in these definitions. The time-varying channel assumed in this paper is based on the Jakes model [29]. The channel transfer function H(f, t) is given by the Fourier transform of h(τ, t). From these expressions, it can be seen that for L > 1, the channel's spectrum response H(f, t) is not flat, which means that the channel has frequency selectivity.
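For orientation, the standard tapped-delay-line relations that match the symbol definitions in this subsection can be written as follows. This is a sketch under assumptions rather than a reproduction of the paper's numbered equations: the summation range 0 to L-1, the explicit time dependence h_l(t), and the product form of W are assumed here.

```latex
\begin{align*}
h(\tau, t) &= \sum_{l=0}^{L-1} h_l(t)\,\delta(\tau - \tau_l), \\
W &= N f_{\mathrm{sub}}, \\
f_d &= \frac{f_c\,\eta}{\iota}, \\
H(f, t) &= \int_{-\infty}^{\infty} h(\tau, t)\, e^{-j 2\pi f \tau}\, d\tau
         = \sum_{l=0}^{L-1} h_l(t)\, e^{-j 2\pi f \tau_l}.
\end{align*}
```

With a single path (L = 1), |H(f, t)| = |h_0(t)| does not depend on f. With L > 1, the phasors e^{-j2πfτ_l} add with frequency-dependent phases, which is the frequency selectivity noted above.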

B. OFDM and Effect of Synchronization Errors
Consider a discrete-time signal containing N subcarriers in the OFDM system. The bit sequence is mapped into complex symbols in the frequency domain, represented as X(k) (k = 0, . . ., N-1), where k denotes the subcarrier index. These symbols are modulated onto the N orthogonal subcarriers and converted to the time-domain signal, x(n), via the inverse fast Fourier transform (IFFT). In the resulting transmission signal, N_CP indicates the number of cyclic prefix symbols. At the receiver side, there are a carrier frequency offset and a synchronization timing offset caused by the discrepancy between the transmitter and receiver oscillators and by the Doppler shift of the channel. These effects have a significant impact on communication performance, so they must also be taken into account. In the received signal, f_o denotes the carrier frequency offset, τ_l is the sample-spaced path delay, and φ indicates the phase factor of an arbitrary carrier. z(n) is the complex white Gaussian noise process with zero mean and variance σ_n^2. Given the timing offset and the maximum delay spread τ_max, if τ_max is within the cyclic prefix N_CP, the orthogonality between subcarriers is guaranteed and the effect of the timing offset appears only as a phase rotation. The m-th received subcarrier R_m is then represented in terms of X_m, H_m, and n_m, which are the m-th transmit subcarrier, the m-th channel frequency response, and the Gaussian noise, respectively. In this case, the phase rotation due to the timing offset can be compensated by channel equalization. On the other hand, when the delay spread exceeds the cyclic prefix, the orthogonality between subcarriers collapses, and interference occurs between symbols and between subcarriers, causing ISI and ICI. The received subcarrier is then rewritten with additional terms, where I_m represents the ISI-induced ICI and η(·) denotes the interference term due to the ISI, which depends on the timing offset. Each of these terms can be modeled as Gaussian noise with a power that is specified per subcarrier for unit signal power [30].
The per-path quantity Δ_l that enters this power is defined separately for the case in which the timing offset is positive and for the case in which the timing offset is smaller than −N_CP + τ_max. These equations suggest that the cyclic prefix should be set sufficiently longer than the delay spread. The synchronization timing offset must be estimated accurately to realize high-quality OFDM communication. Otherwise, the FFT window will be misaligned, and the orthogonalized frequency-domain signals cannot be retrieved.
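The two regimes described above can be checked numerically with a small sketch. It assumes an ideal single-path, noise-free channel (so τ_max = 0), a hypothetical FFT size of 16 with a CP of 4 samples, and a hypothetical set of null subcarriers. The helper names are illustrative and are not taken from the paper. An FFT window that starts inside the CP yields only a cyclic shift, so the null subcarriers stay empty, whereas a window that reaches into a neighbouring symbol leaks power into them.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(N^2) DFT/IDFT, kept dependency-free for illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


N, N_CP = 16, 4          # hypothetical FFT size and CP length
GUARD = (0, 7, 8, 9)     # hypothetical null (guard) subcarriers
rng = random.Random(0)


def qpsk_block():
    """Random QPSK on the data subcarriers, zeros on the guard subcarriers."""
    return [0j if k in GUARD
            else complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / 2 ** 0.5
            for k in range(N)]


def add_cp(freq_block):
    """IFFT followed by prepending the last N_CP samples as the cyclic prefix."""
    time = dft(freq_block, inverse=True)
    return time[-N_CP:] + time


# Three consecutive OFDM symbols; the middle one is the symbol of interest.
stream = [s for _ in range(3) for s in add_cp(qpsk_block())]


def guard_leakage(offset):
    """Power on the guard subcarriers of the middle symbol.

    offset = 0 places the FFT window on the first sample after the CP;
    negative offsets start the window earlier, positive offsets later.
    """
    start = (N + N_CP) + N_CP + offset
    spectrum = dft(stream[start:start + N])
    return sum(abs(spectrum[k]) ** 2 for k in GUARD)


for offset in (-6, -4, -2, 0, 3):
    print(offset, guard_leakage(offset))
```

In this idealized setting the leakage is numerically zero for offsets between -N_CP and 0 and clearly non-zero outside that range. With multipath, the interference-free range shrinks by the delay spread.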

III. CONVENTIONAL PREAMBLE-LESS TIMING ESTIMATION METHODS
This section reviews representative conventional methods. These approaches determine the FFT windowing point by probing the relationship between the CP and the endpoint of the symbol.

A. Maximum Correlation (MC)
The preamble-less synchronization method in [8] simply observes the autocorrelation function between the first and last parts of the OFDM symbol, that is, between the CP and the symbol tail from which it was copied. In its definition, r(n) denotes the time-domain received signal samples, n is the sample index of the most recent input, and (·)* denotes complex conjugation. The synchronization point is then determined by searching for the maximum value of the auto-correlation function. MC realizes synchronization timing estimation with low computational complexity; however, its accuracy is limited in multipath channel environments.

B. Maximum Likelihood (ML)
An ML estimator is presented in [9]. Its estimation accuracy is superior to that of MC, at the expense of computational cost. In the expression for the synchronization point obtained by the ML estimator, σ_s^2 and σ_n^2 denote the signal power and noise power, respectively. Comparing (17) and (18), it can be confirmed that normalization is considered in the second term in (18), which is responsible for the improved estimation accuracy.
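The difference between the two metrics can be illustrated with a minimal sketch. It uses the widely known CP-based forms: the magnitude of the CP correlation for MC, and the same magnitude minus ρ times the window energy for ML, with ρ = σ_s^2 / (σ_s^2 + σ_n^2). The indexing convention, the function names, and the toy parameters (FFT size 32, CP length 8, a delay of 5 samples, no noise and a single path) are assumptions made for illustration and may differ from the exact expressions in [8] and [9].

```python
import cmath
import random


def idft(freq):
    """Naive inverse DFT, kept dependency-free for illustration."""
    n = len(freq)
    return [sum(freq[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]


def cp_correlation(r, n, n_fft, n_cp):
    """Correlate the n_cp samples starting at n with the samples n_fft later."""
    return sum(r[n + k] * r[n + k + n_fft].conjugate() for k in range(n_cp))


def window_energy(r, n, n_fft, n_cp):
    """Average energy of the two correlated blocks (the normalization term)."""
    return 0.5 * sum(abs(r[n + k]) ** 2 + abs(r[n + k + n_fft]) ** 2 for k in range(n_cp))


def mc_metric(r, n, n_fft, n_cp):
    return abs(cp_correlation(r, n, n_fft, n_cp))


def ml_metric(r, n, n_fft, n_cp, rho):
    return mc_metric(r, n, n_fft, n_cp) - rho * window_energy(r, n, n_fft, n_cp)


def estimate(metric, search_len):
    """Return the candidate index with the largest metric value."""
    return max(range(search_len), key=metric)


# Toy received signal: DELAY leading zeros, then three CP-OFDM symbols.
rng = random.Random(1)
N, N_CP, DELAY = 32, 8, 5


def ofdm_symbol():
    freq = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(N)]
    time = idft(freq)
    return time[-N_CP:] + time


rx = [0j] * DELAY + [s for _ in range(3) for s in ofdm_symbol()]
rho = 1.0  # noise-free case: sigma_n^2 = 0

mc_hat = estimate(lambda n: mc_metric(rx, n, N, N_CP), N + N_CP)
ml_hat = estimate(lambda n: ml_metric(rx, n, N, N_CP, rho), N + N_CP)
print("MC estimate:", mc_hat, "ML estimate:", ml_hat, "true CP start:", DELAY)
```

By the Cauchy-Schwarz inequality the correlation magnitude never exceeds the window energy, so with ρ = 1 the ML metric is at most zero and reaches zero only where the two blocks coincide. This is one way to see why the energy normalization sharpens the peak compared with the raw correlation.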

C. Timing Estimation Based on Redundancy of CP (RCP)
Reference [14] proposed synchronization timing estimation focusing on the redundancy in the CP. It uses two steps to obtain the ideal synchronization point: coarse and fine synchronization. First, the metric function described in (18) is utilized to seek the coarse synchronization point n_max. Then, to refine the synchronization point more precisely, a positioning function is applied, especially at the end of the CP region. The positioning function is defined over the index p, where p satisfies p = 0, 1, . . ., N_CP − 1. From (20), the maximum value υ_max is obtained.
According to the results obtained from (18) and (21), the optimal synchronization timing point is then determined.

D. Short Block Based ML (SBML)
The work in [15] employed a correlation block shorter than the CP. In general, the block length for the auto-correlation calculation should be sufficiently long for precise estimation. However, the multipath effect causing ISI disturbs the auto-correlation procedure. This work introduced a shorter block length, N_SBML < N_CP, as a compromise between the accuracy of correlation detection and the multipath effect. Its estimation process follows the ML formulation with this shortened block. Optimizing N_SBML can maximize the synchronization accuracy while suppressing the computational complexity.

IV. PROPOSED METHOD

A. Spectrogram
The proposed method utilizes spectrogram images for OFDM timing synchronization. Fig. 2 shows the influence of synchronization point misalignment on spectrogram images. Here, the part to which the shift of n points is added is R in Fig. 1. Details of the parameters related to this spectrogram are described in Section V. A spectrogram is a representation of the received signal waveform in terms of time, frequency, and power, obtained by the short-time Fourier transform (STFT). The STFT applies a window function and performs the Fourier transform while shifting the window, and it represents temporal variations of the phase and power of the received signal in the frequency domain. Here, the elements of the window function are zero outside a specific interval. The STFT output of the received signal R is obtained by applying the window function f_w and taking the Fourier transform at each time sample shift t. This study employs the rectangular window, which is constant inside the window interval and zero outside. From (9), the synchronization accuracy of traditional approaches is strongly limited by the carrier frequency offset and Doppler shift as well as the additive noise. On the other hand, the proposed method focuses on the power leakage into the guard band, which is the null subcarrier region not used for data transmission. This leakage is caused by ICI if the FFT windowing is misaligned, as expressed by (11)-(15).
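A minimal sketch of the STFT with a rectangular window, as described above, is given below. The frame length, the hop size, and the test tone are hypothetical choices for illustration, and the paper's actual spectrogram parameters are those listed in its Section V. The rectangular window corresponds to taking each frame as-is, with unit weights inside the frame and zero outside.

```python
import cmath


def dft(x):
    """Naive O(N^2) DFT, kept dependency-free for illustration."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


def spectrogram(signal, n_fft, hop):
    """Power spectrogram: one row per time shift, one column per frequency bin."""
    rows = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft]  # rectangular window: unit weight inside
        rows.append([abs(v) ** 2 for v in dft(frame)])
    return rows


# Hypothetical test input: a complex tone located exactly on bin 3 of an 8-point DFT.
tone = [cmath.exp(2j * cmath.pi * 3 * n / 8) for n in range(32)]
spec = spectrogram(tone, n_fft=8, hop=4)

for row in spec:
    print([round(p, 3) for p in row])
```

Every row concentrates its power in bin 3 because a time shift of the frame changes only the phase of the tone. For a received OFDM signal, the rows would instead show the guard-band leakage changing along the time axis.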

TABLE I CLASSIFICATION ACCURACY WITH VARIOUS CNN PARAMETERS
As shown in Fig. 2, the exact synchronization point can be identified by observing the signal leakage at the null subcarrier region, i.e., the guard band, which is intended to avoid ICI. Moreover, the ICI impact can also be reflected in the data subcarrier part. Such features can be learned by a CNN to classify the synchronization timing as an output, even without the auto-correlation calculation using the CP. This is the key idea of our proposal.

B. Convolutional Neural Network
Currently, machine learning techniques are attracting a lot of attention. Particularly in the task of image classification, CNNs have shown remarkable performance. A CNN enables classification with high accuracy by elaborately extracting the features contained in the input image through convolutional processing. Therefore, it has high potential for wireless communication environment recognition through the received signal waveform. The fundamental structure of the CNN is represented in Fig. 3. Network parameters such as the filter size and the number of layers are determined based on prior work [21]. The network consists of the input layer, convolutional layers, pooling layers, fully connected layers, and the output layer. The transfer function of the CNN is expressed as a weighted sum of the inputs plus a bias, passed through an activation function, where Y_j and X_i indicate the output of the j-th neuron and the input of the i-th neuron, respectively. W_ij and B_j denote the weight from the i-th neuron to the j-th neuron and the bias of the j-th neuron. F represents the activation function, and the rectified linear unit (ReLU) function, which outputs its input when the input is positive and zero otherwise, is adopted except for the output layer in the proposed method. The activation function of the output layer is the softmax function, which normalizes the exponentiated outputs over all output neurons, where V is the number of output neurons.
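The neuron transfer function and the two activation functions described above can be sketched as follows. The function names and the toy weights are illustrative assumptions. The layer below is a plain fully connected layer, used only to show the weighted sum, bias, and activation, and it is not the paper's convolutional architecture.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive inputs, clamps the rest to zero."""
    return max(0.0, x)


def softmax(values):
    """Softmax over V outputs, shifted by the maximum for numerical stability."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases, activation):
    """Y_j = F(sum_i W_ij * X_i + B_j), with weights[i][j] = W_ij."""
    return [activation(sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j])
            for j in range(len(biases))]


# Toy example with two inputs and two neurons (hypothetical numbers).
hidden = dense([1.0, 2.0], [[1.0, -1.0], [0.5, -2.0]], [0.0, 1.0], relu)
probs = softmax([1.0, 2.0, 3.0])
print(hidden, probs)
```

The softmax outputs are non-negative and sum to one, which is why the output layer can be read as a probability over the candidate synchronization timings.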
Table I shows the classification accuracy results for timing synchronization for various CNN parameters. It shows how correctly the 30 patterns of synchronization timing were classified from the spectrogram when Eb/No is 30 dB. From the table, the number of CNN layers that performs best is 9, and the best filter size is 3 × 3. The proposed method adopts this configuration, as shown in Fig. 3. At the beginning of the CNN, eight convolutional kernels with a 3 × 3 filter are utilized. From the second convolutional layer onward, 16, 32, 48, 64, 96, 128, 256, and 512 kernels are applied, respectively, each with a size of 3 × 3. There are 512 neurons in the first fully connected layer and 30 neurons in the second layer. For the pooling layers, max-pooling is adopted with a filter size of 2 × 2.
Fig. 4 shows the classification accuracy and loss in the training process of the proposed CNN. The proposed CNN introduces max-pooling and dropout layers to avoid underfitting and over-fitting. As shown in this figure, the proposed method is stable after a small number of iterations in terms of both classification accuracy and loss and exhibits rapid convergence. In addition, there is almost no difference between the training and validation results, which confirms that underfitting and over-fitting have been inhibited.

C. Synchronization Timing Estimation by CNN From Spectrogram Images
Many conventional synchronization methods in preamble-less OFDM systems exploit the similarity of CPs to find the synchronization point by calculating the auto-correlation value. Our approach eliminates the need for such a correlation calculation; it applies CNNs with supervised training on spectrogram images. Fig. 5 illustrates the model of received symbols in a multipath fading channel. It corresponds to the part of the received signal R in Fig. 1. It shows the situation around the synchronization point, and the area is divided into the regions A_1, A_2, and A_3 according to the amount of delayed signal included. In the case of A_2, appropriate synchronization can be attained thanks to the nature of the CP. However, in the case of synchronization regions A_1 or A_3 (or outside of them), ISI occurs due to the previous symbol (at A_1) or the next symbol (at A_3). This results in disturbances in the received spectrum after the FFT, especially in the guard band.
Fig. 6 exemplifies the difference in the received signal when the FFT window is applied within and outside of A_2. This figure shows that the difference between appropriate and inappropriate synchronization appears in the received spectrum in a visible form. We previously proposed a timing synchronization method using a neural network on the received spectrum [7]. Although this method achieves quick timing estimation, there are still issues in terms of estimation accuracy. Therefore, in this paper, we aim to improve the accuracy by using spectrograms, which carry more information than a one-dimensional spectrum. With its superior feature extraction, a CNN is employed to extract the maximum amount of features related to timing synchronization. A concept diagram of the proposed method is shown in Fig. 3.
As explained, we can use a CNN to determine whether synchronization is appropriate based on the spectrum fluctuation. However, although it is possible to check whether the synchronization is appropriate at any given point, the spectrum alone does not indicate where the appropriate synchronization point is. Therefore, we employ spectrogram images as shown in Fig. 2. A spectrogram is a continuous representation of the spectrum obtained over the specified observation period, in which the strength of the spectral density is expressed as color. This extension is well suited to a CNN. In Figs. 2(c) and (d), the right half of the spectrogram is disrupted due to ISI; signal leakage can be seen in the guard band. The left half is relatively unperturbed, indicating little or no ISI effect. From this observation, the boundary between A_2 and A_3 is located near the center of the spectrogram, and it is the most appropriate point to start synchronization. In this paper, we convert the estimation problem into a classification problem; the output of the network is the classification probability of how much the timing deviates from the optimal synchronization point. Optimal synchronization timing can be achieved with low computational complexity because the softmax function in the final layer rounds to the nearest class of synchronization timing. After the CNN is trained on spectrograms with predetermined synchronization points, it is able to identify the best synchronization starting point by classifying the spectrogram even for unknown signals. Algorithm 1 shows the detailed procedure of the proposed method. Note that the optimal synchronization point must lie within the spectrogram, so its observation range should be wide enough.
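The classification formulation above reduces the final decision to picking the most probable class and mapping it back to a timing value. The sketch below shows only that last step. The mapping from class index to sample offset (here -15 to +14) is a hypothetical labeling chosen for illustration, and the probability vector stands in for the softmax output of a trained network.

```python
def estimate_timing(class_probs, candidate_offsets):
    """Return the timing offset whose class has the highest probability."""
    if len(class_probs) != len(candidate_offsets):
        raise ValueError("one probability is required per candidate offset")
    best = max(range(len(class_probs)), key=lambda i: class_probs[i])
    return candidate_offsets[best]


# Hypothetical labeling: 30 classes covering offsets -15 ... +14 samples.
offsets = list(range(-15, 15))

# Stand-in for a softmax output that favors class index 17.
probs = [0.01] * 30
probs[17] = 0.71

print(estimate_timing(probs, offsets))
```

Because the decision is an argmax over a fixed set of classes, the estimate is always one of the candidate timings, in contrast to a regression output that must be rounded afterwards.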

V. SIMULATION RESULTS

A. Simulation Parameters and Training
Monte Carlo simulations have been conducted to evaluate the performance of the proposed symbol synchronization method. Table II summarizes the simulation parameters. To prepare for synchronization, it is necessary to train the CNN. In this study, the CNN is trained using spectrograms with a ratio of energy per bit to noise spectral density (Eb/No) of 0, 5, 10, 15, 20, 25, and 30 dB under a 15-path Rayleigh fading environment. When spectrograms are used, it has been shown that the feature extraction of the CNN generalizes sufficiently to unknown Eb/No values as long as the training Eb/No values are spaced at 5 dB intervals [21]. The configuration of the proposed CNN is shown in Fig. 3. A spectrogram is a straightforward image that expresses the power density of a received signal in the time and frequency domains. As an advantage of the proposed method, the spectrogram allows a CNN with a simple configuration to extract enough synchronization features to perform proper classification. Thus, to fully utilize this advantage, the proposed CNN increases the number of channels in the convolutional layers gradually. In configurations where the number of channels increases exponentially, the loss of features due to the pooling process increases, and the synchronization accuracy deteriorates. Since the proposed method treats timing synchronization estimation as a classification problem, its effectiveness cannot be properly evaluated by the mean square error. Therefore, in this paper, we demonstrate the effectiveness of the proposed method by comparing the actual synchronization error rate (SER) and BER performance.
The input size of the spectrogram image used for CNN training is 875 × 656 × 3 for the sake of computational resources. 4,000 simulation trials were performed under each Eb/No value. Since there are 30 synchronization points in the observation section, we trained the CNN as a classification problem with 30 classes; 480,000 spectrogram data were used in total, which contain different channel and noise realization patterns. The training network is designed using the MATLAB Deep Learning Toolbox. The overall simulation is conducted on Windows 10 64GB with an Intel Core i9-10900 CPU @ 3.70GHz and an NVIDIA GeForce RTX 3090.

B. Simulation Results
First, the impact of the FFT size on the proposed method is shown. Table III presents the classification accuracy for various FFT sizes. In this paper, the input to the CNN is fixed to an image size of 875 × 656. Therefore, while the computational complexity does not change as the FFT size increases, the classification accuracy degrades when the FFT size exceeds 1024 due to the quantization error in imaging. The proposed method can still achieve some accuracy even when the FFT size is increased. This suggests that the proposed method can handle various FFT sizes by enlarging the input size of the CNN. This section aims to confirm the effectiveness of the proposed method. For simplicity, we assume that the FFT size is fixed to 64.
Table IV shows the classification accuracy results when different modulation schemes are input to the network trained only with QPSK signals. This table shows that no matter which modulation scheme is used as input, the accuracy is almost the same as that obtained in the original QPSK case. Fig. 7 compares the classification accuracy from the viewpoint of the signal representation format. The received signal waveform formats compared here are the spectrogram, the time-domain waveform, the power spectrum, and the IQ constellation. From this figure, it can be seen that the spectrogram shows the highest classification accuracy in all Eb/No regions. This is because the spectrogram carries a large amount of information regarding timing synchronization, as shown in Fig. 2. Its 3-dimensional representation has a high affinity with the CNN for extracting such features.
In this simulation, Maximum Correlation (MC) [8], Maximum Likelihood (ML) [9], the timing estimator based on the redundancy of CP (RCP) [14], Short Block based ML (SBML) [15], 1-D CNN [22], and CDM [23] are used as benchmarks. The method in [22] simultaneously estimates the channel coefficients, CFO, and timing synchronization by a 1-D CNN using preambles. This study uses its timing synchronization estimation part for comparison. Fig. 8 presents the synchronization error rate (SER) versus Eb/No at a Doppler frequency of 0.01 Hz. The SER is calculated by counting a single error whenever the estimated synchronization timing does not match the set synchronization timing among A1 to A3 in Fig. 5. In Fig. 8, the correct synchronization point is set as only the last point of A2, which is the optimal synchronization point in Fig. 5. In other words, if the classified output is not identical to the last point of A2, it is counted as one mistake.
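The counting rule can be sketched as follows. The index layout (A2 covering indices 12 to 15 out of 30 candidate points) and the list of estimates are hypothetical values chosen for illustration, not simulation data. The accepted set may be a single point, as in the strict criterion above, or a wider interval.

```python
def sync_error_rate(estimates, accepted):
    """Fraction of trials whose estimated timing index is not in `accepted`."""
    accepted = set(accepted)
    errors = sum(1 for est in estimates if est not in accepted)
    return errors / len(estimates)


# Hypothetical layout: 30 candidate points, with A2 assumed to cover indices 12..15.
a2 = range(12, 16)
estimates = [15, 15, 14, 16, 15, 11, 15, 13]   # made-up classifier outputs

strict = sync_error_rate(estimates, [a2[-1]])   # only the last point of A2 is correct
relaxed = sync_error_rate(estimates, a2)        # any point inside A2 is correct
print(strict, relaxed)
```

With these made-up estimates, four of the eight trials miss the last point of A2, while only two fall outside A2 altogether, so the strict criterion yields the higher error rate.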
When such strict timing estimation is required, the conventional MC and ML almost always produce erroneous estimates, leading to poor error rate performance. MC uses an autocorrelation function for estimation, and ML additionally accounts for the effect of additive noise on MC. These methods are greatly affected by ISI and ICI. As a result, synchronization errors occur at an unacceptable rate, causing error floors regardless of the Eb/No value. RCP and SBML show superior error rates, especially at high Eb/No, because the peak values required for timing estimation can be obtained accurately thanks to the reduced influence of the noise term.
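For reference, a generic CP-correlation timing estimator of this family can be sketched as below. It is a textbook-style illustration, not necessarily the exact formulation of [8] or [9]. The CP length, the function names, and the weighting parameter `rho` are assumptions: setting `rho = 0` keeps only the correlation magnitude, and `rho = snr / (snr + 1)` adds a noise-aware energy correction.

```python
import cmath
import random

N, CP = 64, 16   # FFT size from the simulations; the CP length is an assumption


def ofdm_symbol(rng):
    """One OFDM symbol (QPSK on every subcarrier) with cyclic prefix."""
    freq = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(N)]
    body = [sum(freq[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]
    return body[-CP:] + body


def cp_timing_estimate(rx, rho):
    """Return the offset theta in [0, N + CP) that maximizes |gamma| - rho * phi.

    gamma(theta) correlates CP-long blocks spaced N samples apart, and
    phi(theta) is the average energy of the two blocks.
    """
    best_theta, best_metric = 0, float("-inf")
    for theta in range(N + CP):
        head = rx[theta:theta + CP]
        tail = rx[theta + N:theta + N + CP]
        gamma = sum(a * b.conjugate() for a, b in zip(head, tail))
        phi = 0.5 * sum(abs(a) ** 2 + abs(b) ** 2 for a, b in zip(head, tail))
        metric = abs(gamma) - rho * phi
        if metric > best_metric:
            best_theta, best_metric = theta, metric
    return best_theta


rng = random.Random(1)
tx = [s for _ in range(3) for s in ofdm_symbol(rng)]
delay = 23                      # samples removed from the front (unknown to the receiver)
rx = tx[delay:]                 # noiseless, single-path channel for clarity
estimate = cp_timing_estimate(rx, rho=1.0)   # rho approaches 1 as the SNR grows
print(estimate, (N + CP) - delay)
```

In this idealized setting the metric peaks at the start of the next cyclic prefix. Under multipath fading the correlation peak is smeared by ISI, which is the weakness discussed above.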
The methods in [22] and [23] result in many synchronization errors. This is because [22] and [23] use regression for their output, which causes rounding errors in the error judgment and complicates strict timing estimation. It can be seen that the proposed method outperforms all other algorithms in the higher Eb/No region. In particular, at an Eb/No of 30 dB, the error rate is 7%, indicating that optimal synchronization can be achieved with high probability. This demonstrates the superiority of the proposed method, which detects the spectrogram disturbance by CNN, over the existing auto-correlation-based approaches.
In Fig. 9, the timing accuracy is evaluated with the whole ICI- and ISI-free period A2 treated as correct, at a Doppler frequency of 0.01 Hz. In other words, synchronizing outside of A2 is counted as a mistake. For packet-based OFDM transmission, the probability of synchronization can be improved if we extract a few points earlier than the estimated synchronization point to avoid the ISI impact [31]. MC and ML are particularly affected by this relaxation, showing better performance than the other conventional methods in the low Eb/No region. The performance of RCP and SBML is reversed compared to Fig. 8. Since SBML introduces a short block length to reduce the computational complexity, it is more sensitive to additive noise. [22] and [23] show superior SER performance, especially in the low Eb/No region. These methods utilize a one-dimensional CNN with the raw received data as its input. Therefore, no quantization errors are introduced by transforming the signal waveform, and they are superior in feature extraction in the low SNR region. While they show better SER performance in such noise-dominated regions, their advantage over the other methods is limited when the effect of noise is small. In contrast, the proposed method shows superior performance in the higher Eb/No regions. Utilizing spectrograms containing more detailed information can maximize the feature extraction capability of CNNs.
Fig. 10 shows the BER performance, including the synchronization result, at a Doppler frequency of 0.01 Hz. Here, timing synchronization is performed by each method to compensate for the effect of STO. The theoretical characteristic indicates the case where all synchronizations are performed optimally. As can be seen from the figure, MC and ML show little improvement in BER as Eb/No increases, consistent with Figs. 8 and 9. In addition, RCP and SBML exhibit an error floor and do not reach a BER of $10^{-2}$. This is because the effects of ICI and ISI become more dominant as Eb/No increases. [22] and [23] are close to the ideal value in the low Eb/No region, as in Fig. 9, but exhibit an error floor when Eb/No is high. Meanwhile, the proposed method shows superior performance in the high Eb/No region without an error floor. The proposed method estimates the synchronization timing from the power distortion in the spectrogram image, which is hardly affected by noise, ISI, and ICI.
Fig. 11 shows the BER performance of synchronization timing compensation with perfect equalization at a Doppler frequency of 0.01 Hz. The evaluation assumes that the phase offset due to STO can be ideally equalized in the frequency domain within A2, the demodulation interval without ICI and ISI. Here, the effect of ISI and ICI is idealized in the timing compensation, so the accuracy of timing synchronization estimation is essential. As can be seen from this figure, all BER performances are improved because of precise synchronization compensation. However, the conventional methods have low timing synchronization accuracy, which limits the BER and leaves a large gap from the theoretical value. Although [22] and [23] show superior performance, especially at low Eb/No, because they use the received signal directly, they gain little from increasing Eb/No due to their low estimation accuracy. On the other hand, the proposed method achieves near-optimal BER performance approaching the theoretical value irrespective of the Eb/No condition, with only a slight gap remaining.
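The frequency-domain equalization of the STO-induced phase offset can be illustrated with a minimal sketch, assuming a noiseless single-path channel and an FFT window that starts inside the cyclic prefix. The CP length and the offset of 5 samples are assumptions for illustration. By the circular-shift property of the DFT, starting the window `early` samples ahead of the symbol body rotates subcarrier k by exp(-j2πk·early/N), and the conjugate rotation removes it.

```python
import cmath
import random

N, CP = 64, 16   # FFT size from the simulations; CP length assumed


def dft(x, sign):
    """Unnormalized DFT (sign=-1) or inverse-direction sum (sign=+1)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]


rng = random.Random(2)
sent = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(N)]
body = [v / N for v in dft(sent, +1)]          # IDFT of the QPSK symbols
symbol = body[-CP:] + body                     # prepend the cyclic prefix

early = 5                                      # window starts 5 samples inside the CP
window = symbol[CP - early:CP - early + N]
received = dft(window, -1)

# Undo the linear phase rotation caused by the early window start.
equalized = [received[k] * cmath.exp(2j * cmath.pi * k * early / N) for k in range(N)]

raw_error = max(abs(r - s) for r, s in zip(received, sent))
eq_error = max(abs(e - s) for e, s in zip(equalized, sent))
print(f"before: {raw_error:.3f}, after: {eq_error:.2e}")
```

The correction is exact only while the window stays inside the ISI-free part of the cyclic prefix, which is why the evaluation restricts it to A2.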
Figs. 12 and 13 show the BER performance with the Doppler frequency changed to 200 Hz, corresponding to Figs. 10 and 11, respectively. As the Doppler frequency increases, the impact of the time-varying channel degrades the theoretical BER performance. These figures show that MC and ML exhibit almost no BER degradation at the higher Doppler frequency. This is because MC and ML exhaustively scan the correlation coefficients of the received signal to find the optimal synchronization point, which is hardly affected by the Doppler shift. The modified algorithms, RCP and SBML, consider channel correlations and additive noise effects; therefore, a higher Doppler shift degrades their BER performance. From Fig. 12, it can be seen that [22] and the proposed method show almost no performance degradation due to the Doppler shift. This is because [22] uses the absolute value of the received signal as the input of the 1-D CNN, while the proposed method uses the spectrogram, which is a power waveform, so both methods are less affected by phase fluctuations. On the other hand, although [23] has a configuration similar to [22], its performance is significantly degraded by the Doppler shift. Because the CDM used in [23] is more complex than the network in [22], it depends more strongly on the data set used for training and cannot correctly follow dynamic changes in the Doppler shift. As seen in Fig. 13, in the case of ideal timing synchronization compensation, the proposed method is the closest to the theoretical values in all Eb/No regions. If the CNN were trained with data that include the effects of the Doppler shift, the synchronization accuracy could be expected to improve further. Nevertheless, it was shown that once the network is trained, the performance of symbol timing synchronization can be steadily improved regardless of the effect of the Doppler shift.
Fig. 14 shows the BER performance of the proposed method at various Doppler frequencies. In this case, the proposed method is trained only with a Doppler frequency of 0.01 Hz. From Fig. 14, it can be confirmed that the proposed method maintains BER performance close to the theoretical value even for different Doppler frequencies. This is because the Doppler effect appears in the spectrogram image as fluctuations in symbol power and does not interfere with the feature extraction necessary for timing synchronization.
Fig. 15 shows the BER performance of the proposed method under various multipath fading channels. Here, the network is trained only in a 15-path fading environment, and A1, A2, and A3 are set considering the worst-case situation (where the delayed wave occupies the entire length of the cyclic prefix). As shown in the figure, the proposed method shows excellent BER performance even in environments whose numbers of paths were not used for training. The performance degradation in the 20-path case is attributed to the delay spread exceeding the CP length.
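Why exceeding the CP length is harmful can be sketched with a simple count of ISI-free FFT-window positions. The CP length of 16 samples and the one-sample path spacing are assumptions for illustration, and the exact definition of A1 to A3 in Fig. 5 may differ from this simplified model. A window starting at offset s relative to the symbol body is ISI-free when max_delay - cp_len ≤ s ≤ 0.

```python
def isi_free_starts(cp_len, max_delay):
    """FFT-window start offsets (relative to the symbol body) free of ISI.

    Offsets are <= 0 and expressed in samples. An empty list means the channel
    delay spread exceeds the cyclic prefix, so every window position sees ISI.
    """
    return list(range(-(cp_len - max_delay), 1)) if max_delay <= cp_len else []


cp_len = 16                       # assumed CP length in samples
for paths in (5, 15, 20):         # hypothetical one-sample path spacing
    starts = isi_free_starts(cp_len, max_delay=paths - 1)
    print(paths, len(starts))
```

Under these assumptions the ISI-free interval shrinks as the number of paths grows and vanishes once the maximum delay exceeds the CP, so no window position can avoid interference from the adjacent symbol.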
In this paper, filtering and windowing are not considered, so time-domain discontinuities between OFDM symbols exist. Although time-windowed OFDM can suppress out-of-band emission [32], the proposed timing synchronization method relies on the ICI effects that appear in the effective data subcarriers of the spectrogram, not only in the out-of-band region, as shown in Fig. 2. Therefore, it is expected to remain effective. In future work, to take such effects into account, we will investigate them using spectrogram images of received signals acquired through experiments on actual equipment.

C. Computation Complexity
Table V compares the various synchronization methods in terms of complexity order, processing speed, and training time. $N_s$ is the number of symbols, $N_{CP}$ is the CP length, and $N$ is the number of subcarriers. Also, $N_l$ is the number of CNN layers, $l$ is the index of the CNN layer, $x_{n-1}$ is the number of input channels in the $l$th layer, $w_n$ is the convolution filter, $s_n$ is the height, and $m_n$ is the size of the output feature map. Here, the processing speed is the average of the measured computation time over 10,000 trials for each method in the environment used for this simulation, under the same conditions as Fig. 11: the synchronization region is set as the period of A2. As seen in Table V, the deep-learning-based methods, i.e., the proposed method, [22], and [23], have a far higher computational complexity order than the other conventional methods. However, the proposed method shows the best BER performance, while its timing synchronization is processed faster than that of [23].
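As a rough illustration of where the CNN cost comes from, the sketch below counts the multiply-accumulate operations of a stack of 2-D convolution layers. The layer sizes and parameter names are hypothetical and are not the architecture of Fig. 3 or the notation of Table V. The count uses the standard product of input channels, kernel area, output channels, and output feature-map area.

```python
def conv_macs(in_channels, out_channels, kernel_h, kernel_w, out_h, out_w):
    """Multiply-accumulate operations for one 2-D convolution layer."""
    return in_channels * kernel_h * kernel_w * out_channels * out_h * out_w


# Hypothetical three-layer stack; these are NOT the layer sizes of Fig. 3.
layers = [
    # (in_ch, out_ch, k_h, k_w, out_h, out_w)
    (3, 8, 3, 3, 100, 100),
    (8, 16, 3, 3, 50, 50),
    (16, 32, 3, 3, 25, 25),
]
total = sum(conv_macs(*layer) for layer in layers)
print(total)
```

None of the factors depends on the number of subcarriers once the input image size is fixed, which matches the earlier observation that the complexity does not change with the FFT size.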
Fig. 16 compares the processing speed for various numbers of subcarriers. In Fig. 16, for a fair comparison, the processing speed on a CPU is measured for all methods. Therefore, [22], [23], and the proposed method, which use CNNs, are slower than the conventional methods. The processing time of all conventional methods increases as the number of subcarriers increases. On the other hand, the proposed method uses spectrogram images of constant size as input, so its processing time hardly increases. Generally, computational complexity and processing speed are major concerns when applying deep learning. The proposed method requires neither the calculation of correlation coefficients for timing synchronization nor complex-number multiplication. Therefore, the proposed method is easy to implement, and its computational complexity is not a problem given the rapid development of computing devices such as GPUs.
The above results validate the effectiveness of the proposed method, which provides robust and accurate timing synchronization capability for various channel environment variations without preambles.It can contribute to highly efficient and reliable wireless communications.

VI. CONCLUSION
This paper proposed a CNN-based symbol timing synchronization method for preamble-less OFDM systems under severe multipath fading channels. Incorrect synchronization causes ISI and ICI, which also produce a disturbance in the spectrum. We focused on this phenomenon and applied a supervised CNN to classify the synchronization points by observing the spectrogram, which carries power density information in the time and frequency domains. Computer simulations show that the proposed method provides more accurate synchronization point estimates than the existing methods. Its superiority is also demonstrated in terms of BER performance at low and high Doppler frequencies. With the proposed method, the lower bound is almost the same as the theoretical value while the increase in the actual processing time is minimized.

Fig. 1. The overview of the proposed system.

Fig. 2. The influence of synchronization point misalignment in spectrogram images. n indicates the sample index. Except for a few samples before the optimal synchronization point, an increase of power in the guard band, i.e., out-of-band emission, can be observed due to the incorrect FFT windowing.

Fig. 3. The CNN structure of the proposed method.

Fig. 4. Accuracy and loss in the training process of the proposed CNN.

Fig. 5. The model of the received symbols around the synchronization point in the multipath environment.

Fig. 6. The difference in the received signal when synchronized within and outside of A2.

Fig. 7. Comparison of classification accuracy with various received signal waveforms.

Fig. 8. Synchronization error rate versus Eb/No. The correct synchronization point is set as only the last point of A2.

Fig. 9. Synchronization error rate versus Eb/No. The correct synchronization region is set as the interval of A2.

Fig. 10.

Fig. 11. BER performance of synchronization timing compensation with perfect equalization at a Doppler frequency of 0.01 Hz.

Fig. 12. BER performance of synchronization timing compensation by various methods at a Doppler frequency of 200 Hz.

Fig. 13. BER performance of synchronization timing compensation with perfect equalization at a Doppler frequency of 200 Hz.

Fig. 14. BER performance of the proposed method at various Doppler frequencies.

Fig. 15. BER performance of the proposed method under various fading channels.

TABLE V
TIME COST OF SYNCHRONIZATION BY VARIOUS METHODS