Estimation of Doubly-Dispersive Channels in Linearly Precoded Multicarrier Systems Using Smoothness Regularization

In this paper, we propose a novel channel estimation scheme for pulse-shaped multicarrier systems using smoothness regularization for ultra-reliable low-latency communication (URLLC). It can be applied to any multicarrier system with or without linear precoding to estimate challenging doubly-dispersive channels. A recently proposed modulation scheme using orthogonal precoding is orthogonal time-frequency and space modulation (OTFS). In OTFS, pilot and data symbols are placed in delay-Doppler (DD) domain and are jointly precoded to the time-frequency (TF) domain. On the one hand, such orthogonal precoding increases the achievable channel estimation accuracy and enables high TF diversity at the receiver. On the other hand, it introduces leakage effects which requires extensive leakage suppression when the piloting is jointly precoded with the data. To avoid this, we propose to precode the data symbols only, place pilot symbols without precoding into the TF domain, and estimate the channel coefficients by interpolating smooth functions from the pilot samples. Furthermore, we present a piloting scheme enabling a smooth control of the number and position of the pilot symbols. Our numerical results suggest that the proposed scheme provides accurate channel estimation with reduced signaling overhead compared to standard estimators using Wiener filtering in the discrete DD domain.


I. INTRODUCTION
Future mobile multicarrier systems have to meet a large variety of requirements.They are driven by increasingly demanding applications.Especially, the connectivity of high mobility devices such as automated vehicles poses a challenge.Automated vehicles have very strict requirements regarding the quality of the communication which is commonly referred to as quality of service (QoS) [1].In particular, ultra-reliable low-latency communication (URLLC) plays an important role in this context [2].It is essential for automated vehicles that sufficient QoS parameters, such as latency and data rate, are reliably provided and this even in high mobility scenarios.In these scenarios, the wireless channel is considered to be doubly-dispersive, i.e., varying in both time and frequency.In addition, efficiency plays an essential role due to limitations of the available spectrum as it is already foreseen that the 5th generation wireless system (5G) cannot fulfill future spectrum needs [3].For this reason, it is important to aim at improved efficiency during the development of future mobile multicarrier systems.It does not suffice to focus exclusively on improvements at higher layers; the physical layer must also be addressed.For example, it is desirable to reduce signaling overhead, e.g., the number of pilot signals, and to increase the reliability of the multicarrier system to avoid packet retransmissions.In this paper, we focus on physical layer enhancements by proposing a novel channel estimation scheme and utilizing linear precoding.
To address those challenges, we need to improve the transceiver structure of multicarrier systems taking pulseshaping filters into account.Nowadays, orthogonal frequencydivision multiplexing modulation (OFDM) is broadly used, e.g., in the 4th generation wireless system (4G), 5G, and wireless local area network (WiFi).OFDM uses rectangular pulses at the transmitter and receiver filterbank.With this setup, time-invariant channels reduce to convolution operators which are easily manageable, but OFDM suffers significant performance losses, when the channel is time-variant [4], [5].In this context, orthogonal time-frequency and space modulation (OTFS) has been introduced by Hadani et.al. [6].It uses the discrete symplectic Fourier transform (DSFT) as orthogonal precoding transform to precode symbols over the entire timefrequency (TF) domain.This approach is very distinct as data and pilot symbols are both placed in the delay-Doppler (DD) domain and are jointly orthogonal precoded [7].Several studies show that OTFS significantly outperforms OFDM in terms of bit error rate (BER) performance [8]- [11].This is due to the fact that the joint orthogonal precoding enables high TF diversity.In particular the achievable channel estimation accuracy is increased, since a pilot symbol placed in the DD domain probes each TF coefficient [12], [13].However, it also comes with some disadvantages.Firstly, channel estimation suffers under leakage effects when it is done in the discrete DD domain [14].Secondly, resource allocation becomes less flexible regarding multiuser aspects [15].Thirdly, the overhead for piloting in the uplink grows proportionally to the number of users [16].This motivates the approach followed in this paper, which is to apply precoding to the data but not the pilot symbols.Although we loose some TF diversity this way, we gain the flexibility to choose any precoding for the data symbols without affecting the piloting scheme.In [17], it is shown that aside from the DSFT any other orthogonal precoding, i.e., 2D orthogonal transform, yields the same high TF diversity, e.g., the low-complexity 2D fast Walsh-Hadamard transform (2D-FWHT).
In particular, the estimation of doubly-dispersive channels is a very important aspect for future multicarrier systems especially when TF symbols are precoded.Since the provision of an accurate channel state information (CSI) and the usage of an appropriate equalizer is essential to enable high TF diversity gains.Vehicular channels are considered to be doubly-dispersive, underspread, and often also to be sparse in the continuous DD domain following the wide-sense stationary uncorrelated scattering (WSSUS) model [18].A channel is underspread if all delay shifts and Doppler shifts are contained within a small region, i.e., both are relatively small.The channel is sparse when only a few point-like scatterers in the continuous DD domain exist.For pulse-shaped multicarrier filterbanks, the inherent sparsity of the channel cannot be harnessed using any form of discrete Fourier transform (DFT) for channel estimation in the discrete DD domain [14], [19], [20].A common way to estimate the channel is to get the leastsquares (LS) estimator from the pilot samples and to smooth them by means of the Wiener filtering in the discrete DD domain, which is commonly referred to as linear minimum mean square error (LMMSE) estimator or Markov estimator [21], [22].This approach however suffers under leakage effects [20], [23].Leakage effects are caused by the presence of both fractional Doppler shifts and fractional delay shifts which are not consistent with the discrete nature of the DSFT [14].
To cope with leakage effects and to promote sparsity, more complex estimation schemes are commonly followed.In this scope, compressed sensing or even super resolution are possible schemes, see for example [23], [24], respectively.In [25], Rasheed et al. propose a compressed sensing based algorithm using orthogonal matching and modified subspace pursuit to estimate the time-varying channels.A framework for sparse Bayesian learning with Laplace priors and a new piloting scheme has been introduced by Zhao et al. in [26], where they consider fractional Doppler shifts but not fractional delay shifts.An off-grid sparse signal recovery to estimate the original channel rather than the effective discrete channel in the DD domain is proposed in [27].In [28], an iterative optimization method is presented by Liu et.al., where a message passing signal recovery algorithm is utilized for channel estimation which takes fractional Doppler shifts but not fractional delay shifts into account.The listed schemes are rather complex, require high computing power, and consider longer time intervals, e.g., are computed adaptively over multiple frames, which does not suite well to URLLC in the context of rapidly changing vehicular channels, as it is known that the WSSUS assumption only holds for a limited duration and bandwidth [29].This makes channel estimation challenging and requires channel estimation on a per frame basis [10].Computationally complex and iterative optimization methods are therefore not considered in the presented paper.
Focusing on low-complexity estimators for URLLC, a common choice is the estimation of the channel main diagonal (CMD) on a per frame basis.This can be done by using an LMMSE estimator which however suffers from leakage effects.In this paper, we propose a novel CMD estimator in the TF domain in contrast to the estimation in the DD domain used for OTFS and DFT based schemes for OFDM.We place pilot symbols in the TF domain to enable higher flexibility and reduced overhead for pilot signaling.However, the pilot and data symbols still need to be properly arranged within a rectangular frame.To apply fast orthogonal precoding transformations, we typically require the input dimension to be to the power of two which equals the number of data symbols.Therefore, the placement of the pilot symbols is not obvious.To control the number and position of the pilot symbols, we propose an algorithm and a so called accordion pilot placement to place pilots in between the precoded symbols in the TF domain.The main contributions of this paper can be summarized as follows: • We study pulse-shaped multicarrier systems with linear precoding for URLLC over doubly-dispersive channels, • we numerically compare different linear precoding transformations, • we propose a novel smoothness optimized estimation scheme of the CMD coefficients which minimizes the energy of the discrete Hessian and takes the ratio between the delay spread and Doppler spread, the self-interference power, and receiver noise into account, and • we introduce a pilot placement scheme, i.e., accordion pilot placement, which enables a smooth control of the number and position of the pilot symbols.

A. Paper Organization
In Section II, the Gabor signaling and doubly-dispersive channel model is introduced.Linear precoding transforms and their diversity gain are discussed in Section III.In Section IV, we detail channel estimation, leakage effects, equalization, data recovery, and the proposed channel estimation scheme.The accordion pilot placement is presented in Section V.In Section VI, we show our numerical results.Finally, we summarize our conclusions in Section VII.

B. Notational Remarks
Random variable vectors, 2D-arrays and matrices are denoted with bold letters.Superscripts (•) * and (•) H denote the complex conjugate and the Hermitian transpose, respectively.Let * denote the non-cyclic 2D convolution which only returns the valid part.The column-wise vectorization operator, the absolute value, the euclidean norm, and the Frobenius norm is denoted as vec{•}, |•|, • 2 , and • F , respectively.We denote δ(•) as the Dirac distribution, as the Hadamard product, E{•} as expectation operation, and j 2 = −1.We denote the indices of down-converted received signal by (•).

II. SYSTEM MODEL
In this section, we introduce the system model which includes the doubly-dispersive channel and the input-output mapping of the information resources.We use a timecontinuous Gabor (Weyl-Heisenberg) signaling to derive a discrete system model for the pulse-shaped multicarrier scheme.We define the Gabor grid Λ = F Z M × T Z N with frequency step size F > 0 and time step size T > 0.The indices I = Z M × Z N run over the cyclic groups Z M = Z/M Z (integers of modulo M ) and Z N = Z/N Z (integers of modulo N ) taking in total M frequency steps and N time steps into account.The overall frame duration T f and bandwidth B are given by the products T N and F M , respectively.Regular Gabor grids can be categorized into three types depending on their TF product T F : Oversampling if T F < 1, critical sampling for T F = 1, and undersampling if T F > 1.
Let us denote the complex-valued pulse-shaping filters for synthesis and analysis as γ(t) and g(t), respectively.We design the pulses to be biorthogonal to obtain a perfect reconstruction in the absence of noise and channel distortions, i.e., At the receiver the orthogonality is typically lost due to channel dispersion which in turn causes self-interference [4], [10], [30]- [34].At the transmitter, the Gabor filterbank uses the synthesis pulse γ(t) to synthesize the transmit signal, i.e., where x ={x m,n } (m,n)∈I is the 2D-array of the TF symbols containing data and pilot symbols.The data symbols are modulated and encoded sequences of letters from a given alphabet generated by an information source.In contrast to the data symbols, the pilot symbols are known at the receiver and are coming from a different alphabet.
The doubly-dispersive channel model in the continuous DD domain with a total of R multipaths can be expressed as where the index set J = {1, . . ., R} associated with each path corresponds, respectively, to the delay shifts τ r , the Doppler shifts ν r , and the complex-valued attenuation factors η r .The assumption of the channel being underspread implies that all tuples (τ r , ν r ) are contained within a small region referred to as spreading region where τ max and ν max correspond to the largest delay spread and largest Doppler spread, respectively [18].In the time domain, the channel in (3) acts on the transmit signal in (2) as a time-varying convolution.Hence the received signal yields The receiver analyzes the signal using another Gabor filterbank.We assume it uses the same Gabor grid as the transmitter and can only differ in the choice of the analysis pulse g(t).
Then, we can describe the measured 2D-array of the TF symbols y ={y m,n } ( m,n)∈I by where y(τ, ν) ={y m,n (τ, ν)} ( m,n)∈I is the 2D-array of the receiver response to a single unit amplitude scatterer where τ is the delay shift, ν is the Doppler shift, and w ={w m,n } ( m,n)∈I is the 2D-array of the noise.In our system model, we assume that the measured noise samples w m,n are uncorrelated zeromean random variables with variance σ 2 > 0. The unit receiver response in ( 5) further evaluates to where φ(τ, ν) ={φ (m,n),( m,n) (τ, ν)} (m,n),( m,n)∈I is the effective channel matrix corresponding to a single unit amplitude scatterer.It can be written as where ∆n = n − n and ∆m = m − m for convenience.
Observe that the integral in (7) corresponds to the cross ambiguity function of γ and g which we define as The 2D-array of CMD coefficients h(τ, ν) ={h m,n (τ, ν)} ( m,n)∈I with respect to a single unit scatterer for ∆m = 0 and ∆n = 0 is given as Due to the assumption of an underspread channel, the diagonal elements of the effective channel matrix are dominant [31]- [34].This motivates the use of CMD estimation which is significantly less complex than maximum-likelihood estimation or iterative interference cancellation methods.Exact orthogonality of the pulses in the integral of (7) would imply that the effective channel matrix reduces to a diagonal matrix.
This, however, cannot be achieved in pulse-shaped multicarrier systems independently of (τ, ν) due to the intrinsic limitations of the cross ambiguity function [10], [13].To cope with this, we separate the off-diagonal terms in (7) which cause the observed self-interference due to both inter-carrier and intersymbol interference.More specifically, we define the selfinterference associated with a single unit amplitude scatterer as Then, we can write ( 6) with ( 9) and (10) as Finally, we can write the input-output relation in ( 5) with ( 11) as where h = {h m,n } ( m,n)∈I and z = {z m,n } ( m,n)∈I is the 2Darray of the CMD and the 2D-array of self-interference, respectively.We assume that the long-term expectation of the power over the normalized and zero-mean TF symbols gives E{|x m,n | 2 } = 1.Therefore, we model the distribution of z m,n as other random variables with uncorrelated zero mean noise and with (unknown) variance σ 2 z > 0, which does not depend on m and n [13], [34].

III. LINEAR PRECODING AND TF DIVERSITY
We can significantly improve the performance of the multicarrier system by using a so-called linear precoding also referred to as spreading.We point out that our model can be easily extended to include redundancy (i.e., number of rows is greater than number of columns of the precoding matrix) but, for the sake of simplicity, we restrict ourselves to orthogonal transforms.Precoding and decoding are applied to the TF symbols prior to Gabor synthesis and after Gabor analysis, respectively.Generally, we refer to any energypreserving linear mapping from information symbols X to TF symbols x as linear precoding and its inversion as linear decoding, accordingly.The key idea behind precoding is to intermingle information symbols such that each TF symbol contains information on all information symbols, which turns out to enable high TF diversity at the receiver [17], [35].The precoding distributes equalization errors and self-interference evenly across all information symbols, so that per symbol modulation works more reliable.This is important since -for example -the equalization error becomes locally large near zero-crossings of the CMD coefficients.In turn, the BER is significantly reduced.OTFS is a notable example that applies jointly orthogonal precoding to both data and pilot symbols.In OTFS, all symbols X = {X ,k } ( ,k)∈I • are placed in the DD domain and then transformed into the TF domain by applying the 2D discrete symplectic Fourier transform (2D-DSFT), i.e., where we use The 2D-DSFT in ( 14) is its own inverse as a result of opposite exponential sign, the flipping of the axes, and normalization; hence, orthogonal precoding and orthogonal decoding are the same operation. 1To some extent, the choice of the 2D-DSFT for orthogonal precoding is motivated by ( 6) and ( 13), which show that are the samples of a low-frequency 2D trigonometric polynomial which corresponds to Dirac pulses in the continuous DD domain.Many OTFS channel estimation schemes aim at making use of this fact, and it has been topic of many research to harness the sparsity of the channel [36], [37].Since our proposed channel estimation scheme only precodes data symbols, the orthogonal basis function is independent of the proposed piloting scheme and the choice of it is arbitrary as long as maximum TF diversity is achieved.

IV. CHANNEL ESTIMATION AND EQUALIZATION
In this section, we discuss the estimation of doublydispersive channels and leakage effects.We present the proposed channel estimator using smoothness optimization and detail its design choice as well as pilot signaling, equalization, and data recovery.

A. Leakage effects
In pulse-shaped multicarrier systems, the sparsity of the channel diminishes after applying discrete Fourier transforms to the received symbols after the Gabor analysis filterbank.
To see this, we compute the 2D-array of the CMD for a (τ, ν)-scatterer in DD domain by applying the 2D-DSFT to h m,n (τ, ν) in ( 9) as [14]  where D K corresponds to the Dirichlet kernel for an integer K > 0 which is defined to be and M F τ are integers.This is exactly not the case when fractional delay shifts and fractional Doppler shifts are present, thereby causing the observed leakage effects.Moreover, discrete Fourier transforms assume that their input samples stem from a bandlimited and periodic function, which is not true in our setup.The observed leakage of an individual scatterer follows the shape of the poorly localized Dirichlet kernel for the most part, whereas the design of the cross ambiguity function has a comparably marginal impact on it, see also [14].This degrades the performance of the channel estimation significantly unless more expensive leakage suppression techniques are applied, cf.[14], [38].

B. General piloting scheme
Our goal is to estimate the CMD coefficients h from the pilot symbols of the received frame y in (13).Motivated by the 2D orthogonal precoding via the 2D-DSFT, we let the data symbols originate from a 2D data frame indexed by a Gabor grid I = Z M × Z N with M ≤ M and N ≤ N .Then, the symbols from the data frame are multiplexed with the pilots into the TF frame.Therefore, we define the index set of the data symbols in the TF frame as D ⊂ I satisfying #D = #I = M N .For the indices of the pilot symbols, we take the complement set P = I \ D and put P = #P as the number of pilots.We choose arbitrary bijective maps κ d : D → I and κ p : P → {1, . . ., P } which describe how the data and pilot symbols are mapped onto the transmitted and received TF frame.We illustrate this mapping in Fig. 1.
Given a fully precoded 2D-array x = {x m ,n } (m ,n )∈I of data symbols and a vector of pilot symbols p ∈ C P , we define the content of the TF frame by the following multiplexing: The ordering of the elements x does not impact the achievable TF diversity gain when using orthogonal precoding transformations [17].For this reason the choice of κ d and κ p does not impact the performance, whereas the size of D and P does.
At the receiver, we can then extract the distorted pilot vector q ∈ C P ×1 from the received frame in (13) by Then, we can estimate the channel from ( 19) by different schemes as described in the following subsections.

C. Standard LMMSE estimator
The LMMSE estimator, which is a DFT-based estimator, follows a regularized LS scheme as discussed in [21], [39], [40].This scheme assumes that most of the energy of CMD coefficients is concentrated near the origin in the DD domain.The least-squares reconstruction is then performed on a subset of the DD domain which we refer to as reconstruction grid.We reduce the degrees of freedom by enforcing the estimated CMD to be zero outside the reconstruction grid, which we define as where Q and W specify the reconstruction grid for the expected shifts in Doppler domain and delay domain, respectively.They need to be selected such that Q > ν max T N and W > τ max F M .However, due to the poor resolution of the Dirichlet kernel in ( 16), fractional shifts are smeared over the DD grid.As a consequence, the reconstruction grid has to be expanded.In the case of smeared Doppler shifts, we just increase Q since they are generally distributed symmetrically to the origin.In contrast, delay shifts are distributed asymmetrically.Therefore, we introduce the parameter W n to consider smeared delays close to the origin.We start with the initial partial CMD estimate h pilot ∈ C P given as h pilot s = q s /p s s = 1, . . ., P.
To complete the CMD estimate, we search for the best LS fit among all CMDs which are supported on the reconstruction grid.For this, we define a sub-matrix C ∈ C P ×2Q(W +Wn) of the 2D-DSFT to link the reconstruction grid to the pilot symbols in TF domain as where ( m, n) ∈ P and ( ¯ , k) ∈ K.With this in hand, we can formulate the optimization problem as where H = { H¯ , k} ( ¯ , k)∈K is the 2D-array of the estimated CMD.The optimization problem in (23), has a closed form solution which is given by where Finally, we transform the estimated CMD coefficients of (24) to the TF domain by applying a 2D-DSFT, i.e.,

D. Proposed smoothness regularized channel estimator
We propose to estimate the CMD coefficients in the TF domain to avoid leakage observed in the discrete DD domain.Our scheme estimates the channel by interpolating smooth functions from the received pilot symbols.This is achieved by a novel regularizer which minimizes the energy of the second order derivatives.To justify this, we point out that in (15) the channel in the continuous DD domain consists of samples which are 2D trigonometric polynomials and low-frequency meaning that they are relatively slow changing compared to the frame size.In general, it is known that the second order derivative is a measure for the smoothness of functions.To smooth such functions, it is a common approach to minimize the second order derivative of the samples.The proposed channel estimation scheme follows this approach.
We compute the second order discrete derivatives using noncyclic convolutions with kernels of the size 3 × 3. Specifically in our setup, the 2D convolution of an array of size M × N , i.e., we consider the valid part of the convolution.We define the kernels as where Φ ff , Φ tt and Φ tf correspond to the second order partial derivatives with respect to frequency, time and mixed dimensions, respectively.With this in hand, we define the discrete weighted Hessian with m = 0, . . ., M − 1, n = 0, . . ., N − 1 as where the scaling parameters α, β > 0 assist in compensating channel modes which we detail in Section IV-E.The proposed channel estimator provides a solution to the optimization problem given as where δ is a relaxation parameter and h ex is an array containing the CMD estimate with appropriate padding to compensate for the size reduction from the convolution.The optimization problem in ( 30) is a convex constrained LS problem which can effectively be solved by standard methods [41].The actual CMD estimate h is then obtained by truncating h ex at the frame boundaries, i.e.,

E. Awareness of the channel mode
The scaling factors α and β in (30) control the preferred channel mode for the reconstruction, defined below.Let us briefly explain the intuition behind the weighting.The 2Darray of the CMD coefficients is in fact a sampling of an underlying differentiable function h(f, t) as shown in (15).In essence, the mode of a channel is given by the ratio of its 2Dsupport in the DD domain.Suppose h(f, t) is approximately supported on the rectangular box [−β, β] × [−α, α] in DD domain.Writing h(αf, βt) = u(f, t), we have that u in DD domain is supported on the unit square [−1, 1] × [−1, 1] and its mode is balanced between delay domain and Doppler domain.In (30), it is beneficial to regularize on the Hessian of u rather than h as the regularizing term does not favor any particular direction.By standard calculus, we know that the (continuous) Hessian matrix of u at the point (f, t) ∈ R 2 is given by As we only have access to the (discrete) Hessian of h, we include additional scaling into the optimization manually, obtaining the weighted discrete Hessian matrix as in (29).In summary, given that the doubly-dispersive channel in (3) has maximum delay spread τ max and maximum Doppler spread ν max , a reasonable choice is to put α = ν max and β = τ max .However, depending on other factors, for example, if the contribution of many scatterers is negligible, the parameters should be adjusted accordingly.

F. Noise-awareness
We relaxed the data fidelity term in (30) to mitigate noisy measurements.The relaxation parameter δ needs to match the expected error given as Considering (21) with noise and self-interference, we get the initial CMD estimation as and thus Hence, we choose the relaxation parameter as We can simplify (36) to δ = (σ 2 + σ 2 z )P , by considering the pilots to be normalized to unit energy per symbols, i.e., E{|p s | 2 } = 1.

G. Pilot placement
Most relevant to the performance of the proposed channel estimation scheme is the choice of pilot positions, represented by the set P. As we optimize second order derivatives, we have to be aware that the approximation error in h ex tends to grow quadratically in the distance to the nearest pilot.For that reason, it is best if P is distributed as uniformly as possible within I.This matter is complicated by the fact that orthogonal precoding transformations, such as the DSFT or fast Walsh-Hadamard transform (FWHT), work best if M and N are powers of 2. We are therefore targeting a transmit frame size of M × N and have P = N M − N M pilots.It is however not obvious how to distribute these uniformly in general.To remedy this, we propose a piloting scheme in Section V.

H. Complexity of estimators
Let us discuss some complexity aspects of the considered optimization problems.For the standard LMMSE estimator, we need to solve the unconstrained linear LS problem in (23).The complexity of ( 23) usually grows cubically with the frame size, i.e., by (N M ) 3 .However, due to the closed form solution in (24) the LS estimator matrix can be computed for each σ 2 offline.Then, the LS problem is reduced to a matrix vector product for a fixed σ 2 .Regarding the complexity of the proposed estimator scheme, we need to solve the optimization problem in (30).The problem in ( 30) is however a constrained LS problem and does not have a closed form solution.By using Tikhonov regularization [42], we can convert the constraint problem in (30) into an unconstrained LS problems as where Ω is another regularization parameter.Then, for each Ω a closed form solution exists which can be computed offline as for the LMMSE estimator.

I. Other considerations of the proposed estimator
Let us discuss some other design choices and aspects of the proposed channel estimator in (30).Our first design decision is to extend the optimization variable h ex rather than using padded convolution.Regarding the most common padding techniques for convolution, we observe that: • Zero-padding causes a significant amplitude drop near the frame boundaries, which does not fit our model for h as seen in ( 13).• Mirror-padding favors solutions that are flat at the frame boundaries.Although performing better than zeropadding, it still yields inferior estimations compared to the extension approach.• Circular padding leads to a leakage effect similarly to the OTFS piloting scheme in DD domain, cf.Section IV-A.
Let us explain the specific choice of minimizing the energy of the second order derivative.Minimizing the gradient does not yield satisfactory results, as the trigonometric polynomials making up the true solution are not close to being (piece-wise) linear.Using higher order derivatives requires more and larger kernels and therefore more computational time and memory.
In addition, the derivatives are less stable and often cause unreasonably large values to appear in the solution, especially near the frame boundaries.In fact, we found no significant improvements in performance for third order derivatives and even worse performance for derivatives of greater order.

J. Equalization and data recovery
Let us detail the equalizer to construct the transmitted TF frame from the received TF frame with the estimated channel and the recovery of the transmitted bits.The choice of a suitable equalization should be made based on the selected channel estimation scheme.Recall that we estimate the CMD and not the effective channel matrix with off-diagonal terms.
We furthermore aim at a data recovery on a per frame basis not considering iterative schemes.This makes one-tap equalization to a suitable scheme which we follow in this paper.We use a linear minimum mean square error (MMSE) equalizer and get the equalized TF frame as We demultiplex the TF frame to extract the precoded data frame x ∈ C M N by Then, x is linearly decoded, demodulated and decoded, yielding the transmitted information bits.

V. ACCORDION PILOT PLACEMENT
In this section, we introduce the proposed accordion pilot placement.Let us explain the choice of this name.At the transmitter, the precoded data frame is spread out to place pilots between the data symbols.This is required to properly estimate the channel.At the receiver, the pilots are then extracted and despreading is applied to obtain the initial data frame.This procedure is similar to the movement of an accordion and explains its naming.The fundamental idea of the proposed pilot placement is to use a fixed amount of pilots P in each row (or column) and successively shift the positions circularly by some fixed hop size µ ∈ Z.We have to carefully choose a suitable hop size, otherwise pilots remain clustered.

A. General idea of using lattices
To explain the idea behind finding a suitable candidate for the shift µ, we assume for now that N is divisible by P .Then, we can construct P from a lattice on Z2 of the form and consider the restriction of the pilot indices by where λ = (N + P )/P is the distance between two pilots within each row and µ ∈ Z is the circular shift from row to row.We target to find the most appropriate µ.For a given set P, we consider the minimal distance between mutually distinct points given by We may say P is uniformly distributed in the index grid I, if it maximizes the minimal distance, i.e., P is a solution to max P⊂I d(P), s.t.#P = P.
Unfortunately, d(P) can not be computed easily, but d(Λ λ,µ ) can.We therefore rather solve Note that Λ λ,µ contains 0 and we simply have = min Computing the squared minimal distance is actually a quadratic integer optimization problem.For fixed ∈ Z, it is easy to compute a minimizer k µ, like Moreover, any solution (k, ) to (46) satisfies which is fast to compute for any given µ.Finally, because Λ λ,µ = Λ λ,µ+λ for all λ ∈ Z, we can restrict the search space to µ = 0, . . ., λ − 1 and obtain the optimal shift by solving

B. General algorithm
We have seen how to construct P in an ideal case, i.e., if the P pilots distribute uniformly along each row of length N = N + P .The general algorithm to find a fitting accordion placement is given in Algorithm 1.The key idea is to determine the ideal shift value for an approximate lattice by rounding λ first and then computing µ according to (50) for the idealized setting, as seen in lines 1 to 4 of Algorithm 1.In line 5, we place pilots as uniformly as possible in a single row using rounding.From line 7 onwards, we take the row indices R and shift them circularly by µ and append the new indices to P. An example of the proposed accordion placement is shown in Fig. 3.

VI. NUMERICAL SIMULATIONS AND RESULTS
In this section, we present the simulation setup and numerical results.We compare the performance of different linear precoding transformations and evaluate the proposed channel estimation scheme.where k µ, as defined in (47). A. Numerical simulation setup For the numerical evaluation, we choose a typical URLLC scenario in which a vehicle receives short-frame messages from a base station [10].In this scenario, the vehicle has to reliably recover the transmitted bits from each frame.To do so, the channel is estimated and equalized on a per frame basis.
To evaluate pulse-shaped multicarrier systems in highmobility scenarios, the right choice of the simulation setup is essential.In Table I, we list the selected simulation and system parameters.We use the geometric-statistical channel simulator QuaDRiGa to generate the channels [43].To obtain doublydispersive channels, we update the channel samples generated by the simulator within the duration of one frame.The channel then becomes time-variant at higher velocities.We therefore configure the simulator to update the channel samples at a rate of 1 /B=0.2µs.For the channel model, we choose the 3GPP 38.901 UMi non line-of-sight (NLOS) model which takes R=58 multipaths into account.All together, the high sampling rate, the NLOS scenario, and high velocities allow us to obtain highly time-variant channels with our simulation setup.
To realize the pulse-shaped multicarrier system, we use LTFAT which provides a transceiver structure based on a polyphase implementation of filtering [44].We choose a TF product of T F = 1.25 to balance the trade-off between the signal to interference ratio and spectral efficiency [45].In the TF domain, we design the short-frame to consist of N = 64 time steps and M = 64 frequency steps [14].We use a bandwidth of B = 5 MHz and a frame duration of T f = 1 ms.This results in a time step size of T = 16 µs and a frequency step size of F = 78.125 kHz which are also referred to as symbol length and subcarrier spacing, respectively.At the Gabor filterbank, we utilize orthogonalized Gaussian-like pulses to synthetize and analyze the transmitted and received signal in the time domain, respectively.These pulses are generated by orthogonalizing a prototype pulse on a tight Gabor frame which is commonly referred to as S −1/2 -trick [34], [46]. 2  These pulses are identical by construction, i.e., γ = g, and each pulse is orthogonal to its translations on the TF grid.From the channel simulator, we get 58 multipaths with 5120 time samples for each of them.The total length of these channel samples corresponds to the duration of one frame, i.e., T f =1 ms.Then, we apply a time-varying convolution between the transmit signal and each multipath and obtain the superposition of all of them as received signal.To assure a cyclic convolution, we add a block cyclic prefix to the samples with appropriate length.

B. Comparison of distinct linear precoding setups
We numerically investigate the impact of applying different linear precoding transformations to the data frame, the gained TF diversity, and how this gain depends on the size of precoded data frame.Let us therefore detail the distinct setups considered in this subsection.Since we focus on precoding, we only place data symbols into the TF frame and assume full CSI knowledge of the CMD at the receiver.The perfect CMD is used for MMSE equalization as detailed in (38).To precode the data frame, we apply the DSFT, FWHT and fast Fourier transform (FFT) each as a 1D and 2D transformation, and we consider random precoding and without precoding as well.
In addition, we study setups in which we subdivide the TF frame into two, four, and eight sub-frames (SF) corresponding to SF-2, SF-4, and SF-8, respectively.Fig. 6 depicts the TF frame in which the three considered SF structures are shown.The data frame is divided by the number of SF into smaller data frames.Each is then separately precoded and mapped to the corresponding SF.The approach of using precoded SF is particularly interesting for URLLC, as it provides higher flexibility.For example, a vehicle can already process single SF to recover the transmitted bits before receiving the entire frame.From another perspective, the SF can be used in multiuser scenarios improving reliability and providing higher flexibly than OTFS.This however comes at a price.Recall that by applying precoding to the data symbols, we increase the reliability of modulation or in other words we gain TF diversity.This is due to the fact that equalization errors and self-interference are distributed over all symbols.By reducing the size of the precoded data frame, we also decrease the potential TF diversity.To measure the impact of TF diversity gain on the precoded symbols, let us use the normalized maximal symbol-error deviation (NMSED) as metric.
In Fig. 4, we study the mean square error (MSE), the uncoded BER, the NMSED as a function of the velocity for all investigated setups.Fig. 4a depicts the relative symbol MSE and shows that it is the same for all setups.Fig. 4b shows that all precoding functions achieve the same low BER when applied to the entire TF frame.In particular, 1D and 2D precoding lead to the same BER performance.We observe in general significant performance gains of precoded data frames compared to data frames without precoding.In the case of SF, we can observe that the BER increases by approximately 3 to 4 dB for each subdivision of the TF frame.The reason behind this behaviour can be explained when looking at NMSED in Fig. 4c.The precoding ensures that the error energy from self-interference and equalization near zero-crossings of h are equally distributed across all symbols in the TF frame.

C. Performance of channel estimation
Let us evaluate the performance of channel estimation in a linearly precoded multicarrier system which is shown in Fig. 2. We place both pilot and precoded data symbols into the TF frame.We utilize the FWHT to precode the data frame which offers reduced implementation complexity compared to DSFT precoding [47].The choice of precoding is however arbitrary   since any other linear transformation leads to the same TF diversity gain as shown in Section VI-B.The pilot symbols are placed according to Algorithm 1.We estimate the CMD and use MMSE equalization to revert the distortion incurred by doubly-dispersive channel.We study the proposed channel estimation scheme by solving the optimization problem in (30) in four different configurations and denote them as follows: smoothness regularized Hessian (SRH), smoothness regularized Hessian with noise-awareness (SRH-NA), smoothness regularized Hessian with mode-awareness (SRH-MA), and smoothness regularized Hessian with mode-and noiseawareness (SRH-MNA).This allows us to study the impact of taking noise-awareness as well as the mode-awarness into account.We detail the investigated estimators in Table II.
Fig. 5 depicts the uncoded and coded BER as function of SNR for different numbers of pilot symbols using convolutional hard-decision decoding at rate of 1 /3 in the latter case.In Fig. 5a, we show the uncoded BER using 256 pilots.Fig. 5b illustrates that with coding an error-free transmission is only achieved for the perfect CMD, SRH-MNA, and SRH-MA estimators at a SNR of 10.5 dB, 12 dB, and 13 dB, respectively.For all remaining estimators, an error floor between −35 and −38 dB is reached.Fig. 5c depicts the uncoded BER when 512 pilot symbols are used.By doubling the number of pilot symbols to 512, the standard LMMSE estimator performance is improved.Fig. 5d illustrates that with coding an error-free transmission is only achieved for the perfect CMD, SRH-MNA, and SRH-MA estimator at a SNR of 10.5 dB, 11 dB, and 13.2 dB, respectively.For all remaining estimators, an error floor between −46 and −48 dB is reached.Fig. 7 shows the uncoded BER as a function of the number of pilot symbols at a SNR of 15 dB with a relative velocity between 100 and 400 km /h.

VII. CONCLUSIONS
We proposed a novel channel main diagonal estimator scheme which minimizes the energy of the second order derivatives.This scheme considered noise including selfinterference power and the ratio of the channel spreading region for anisotropic regularization of the weighted Hessian matrix.The use of Tikhonov regularization allowed us to obtain an unconstrained least-squares problem where a closed form solution for each noise parameter exists and can be computed offline.We introduced a new pilot placement scheme that enables a more efficient use of resources and improved channel estimation.The numerical results showed that the proposed scheme allows an accurate channel estimation and equalization of received short-frame messages even in highly time-varying communication scenarios.

Fig. 1 .
Fig.1.Multi-and demultiplexing of the pilot vector and precoded data frame in the TF domain.

Fig. 4 .
Fig. 4. Performance comparison of different linear precoding transformations at 12 dB SNR assuming full CSI of CMD.

Fig. 5 . 2 Fig. 6 .
Fig. 5. BER as a function of the SNR for different channel estimation schemes at relative velocity of ∆v = 200 km /h.

Fig. 7 .
Fig. 7. BER as a function of the number of pilots in NLOS vehicular scenarios at a SNR of 15 dB for distinct channel estimators.