Estimating and Tracking Wireless Channels Under Carrier and Sampling Frequency Offsets

This article addresses the challenge of estimating and tracking wireless channels under carrier and sampling frequency offsets, which also encompass phase noise and sampling time jitter. We propose a novel adaptive filter that explicitly estimates the channel impulse response, carrier frequency offset, and sampling frequency offset by minimizing the mean-square error (MSE) and, when the estimated parameters are time-varying, inherently performs tracking. The proposed filter places no requirements on the structure of the waveform, but the digital transmitted waveform must be known to the receiver in advance. To aid practical implementation, we derive upper bounds for the filter's step sizes. We also derive expressions for the filter's steady-state MSE performance, by extending the well-known energy conservation relation method to account for the self-induced nonstationarity and coupling of update equations that are inherent in the proposed filter. Theoretical findings are verified by comparison to simulated results. Proof-of-concept measurement results are also provided, which demonstrate that the proposed filter is able to estimate and track a practical wireless channel under carrier and sampling frequency offsets.

As such, estimating and compensating frequency offsets in those cases is essential. Although not the main focus of this work, a popular example is orthogonal frequency division multiplexing (OFDM), where carrier frequency synchronization at the receiver must be performed accurately in order to avoid loss of orthogonality between the subcarriers. Those systems can tolerate only carrier offsets that are a fraction of the subcarrier spacing without large degradation in performance [3], [4], [5]. The performance of OFDM systems can also degrade due to sampling frequency offsets [4], [6], although this effect is often less significant. Various methods for joint carrier and sampling frequency offset estimation and compensation in OFDM systems exist [7], [8], [9], [10], [11], [12], [13], [14], not to mention abundant works on either of the offsets alone. However, those methods largely rely on properties that are strictly characteristic of OFDM and are not directly applicable to other applications.
Carrier and sampling frequency offsets also pose a major challenge in known-interference cancellation. The capability to cancel known interference is a fundamental prerequisite of physical layer security schemes that aim to prevent eavesdropping by superposing the signal of interest with interference that is known only to the legitimate receiver. Perfect known-interference cancellation has long been assumed feasible in theoretical physical layer security works without practical basis [15], [16]. However, lack of proper frequency synchronization has a considerable negative effect on the cancellation performance [16], [17], which has led to the development of interference cancellation methods with built-in frequency synchronization [18], [19].
Frequency synchronization, as well as time synchronization, is also a key issue in interference alignment and distributed beamforming. These techniques envision concurrent transmissions that result in a substantial increase in a wireless network's total capacity [20] or an increase in range and energy efficiency [21]. In addition, since distributed beamforming directs more power in the desired direction, less is scattered in undesired directions, potentially improving security [21]. However, the challenges in realizing the benefits of interference alignment and distributed beamforming again include coordinating the transmitters for distributed information sharing as well as carrier and sampling synchronization, so that the transmissions combine as required at the destination [22].
Bistatic radars are promising supplements to classical monostatic systems, and they too face the challenge of synchronization.
Unlike a monostatic radar, a bistatic radar has its transmitter and receiver on separate platforms, which results in various operational advantages, e.g., additional information about the scene, as the scattering characteristics of objects depend strongly on the line-of-sight vectors to the transmitter and receiver. Another advantage is the potential for cost reduction by using one transmitter, or even illuminators of opportunity, and several passive receivers [23]. However, separation of the transmit and receive platforms necessitates time and frequency synchronization for coherent signal processing and range measurement [24], [25].
Similar challenges arise in the acoustic domain. For example, in underwater acoustic communications, the use of wideband modulation and the low velocity of acoustic waves mean that Doppler shifts have a significantly larger impact than in the electromagnetic domain, and these shifts need to be compensated for [2]. In acoustic echo control, sampling frequency offsets between separate devices, if not compensated for, can cause poor echo cancellation performance [26], [27].
Adaptive filters are often used in such nonstationary environments, and consequently frequency offsets compromise the conventional filters' performance [28]. To that end, various extended adaptive filters have been proposed that are able to track certain nonstationarities or nonlinear impairments. For example, least mean squares (LMS)-type gradient descent has been used for explicit time-delay estimation [29] as well as for power amplifier distortion [30] and IQ imbalance compensation [31]. An LMS-type adaptive algorithm has been proposed for joint channel estimation and explicit sampling rate correction in acoustic echo control applications [27]. The adaptive notch filter proposed in [32] is a simple algorithm capable of extracting a nonstationary narrowband signal buried in noise, making it essentially a carrier frequency offset tracker. However, a single general algorithm for tracking a channel under both carrier and sampling frequency offsets, without specific requirements on the waveform, is still missing.
The purpose of this article is to present an efficient adaptive algorithm for estimating and tracking a channel under time-varying carrier and sampling frequency offsets when the receiver knows the signal that is to be transmitted, or at least a considerable part of it, in advance. The presented algorithm aims to be waveform-agnostic and does not strictly rely on the characteristics of the underlying system. Hence, it is potentially applicable to the aforementioned concepts and beyond. We provide a thorough analysis on the optimal selection of the algorithm parameters (viz. three step sizes) to facilitate rapid convergence, and we carry out theoretical steady-state analysis for the proposed algorithm by extending the well-known energy conservation relation [33]. The extended relation introduces nonstationary a priori errors for each update equation and decouples the errors of separate update equations to account for the algorithm's self-induced nonstationarity. Several supporting simulations are provided, which verify the theoretical results and demonstrate that the algorithm is able to track time-varying frequency offsets. Furthermore, proof-of-concept measurement results are presented, which illustrate that the algorithm is capable of explicitly estimating and tracking a wireless channel and frequency offsets between two radios. The proposed algorithm is positioned relative to the existing works and comparisons are made throughout.
Fig. 1. General system model considered in this work, focusing on the carrier and sampling frequency offsets together with the channel impulse response between a transmitter and a receiver. In this work we assume that the digital transmitted signal x(n) is known to the receiver.
The rest of this article is organized as follows. Section II introduces a general system model, and in Section III the novel adaptive algorithm for estimating and tracking the parameters of the system model is presented; bounds for the algorithm's step sizes are also derived there. In Section IV, expressions are derived for the steady-state mean-square error (MSE) of the proposed algorithm, by introducing an energy conservation relation that accounts for the algorithm's self-induced nonstationarity. Section V compares the theoretical MSE results to simulations and provides proof-of-concept experimental results, together with a brief comparison to an existing method. Finally, conclusions of the study are given in Section VI.
Notation: Small boldface letters denote vectors and capital boldface letters denote matrices, e.g., w and R. The symbol * denotes Hermitian conjugation for vectors and complex conjugation for scalars. The identity matrix is denoted by I and a zero vector by the boldface letter 0, both with dimensions compatible with the context. The iteration index is placed as a subscript for vectors and between parentheses for scalars, e.g., w_n and v(n). All vectors are column vectors, except for two, namely the input data vector x_n and its resampled counterpart y_n, which are taken to be row vectors for convenience of notation. Lastly, E[·] is the statistical expectation operator.

II. SYSTEM MODEL
The system model considered in this work focuses on the time-varying sampling and carrier frequency offsets between a transmitter and a receiver, along with the channel that separates the two, as illustrated in Fig. 1. The relative sampling frequency offset between the two devices is denoted as η_o + β(n), where η_o = ΔT/T_x represents the fundamental time-invariant offset, with ΔT = 1/f_d − 1/f_x being the difference between the sampling periods at the receiver and transmitter, f_d the sampling frequency at the receiver, f_x the sampling frequency at the transmitter, and β(n) the time-varying offset, including sampling jitter. The carrier frequency offset is denoted as ε_o + φ(n), where ε_o = ω_d − ω_x denotes the fundamental time-invariant offset between the receiver and transmitter carrier frequencies, ω_d is the carrier frequency at the receiver, ω_x is the carrier frequency at the transmitter, and φ(n) is the time-varying offset, including phase noise. Lastly, we denote the finite impulse response of the complex-valued channel with order M as w_o.
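As a quick numeric illustration of the definitions above, the fundamental offsets can be computed from example oscillator values (the numbers below are hypothetical, not taken from the article):

```python
import math

# Hypothetical example: a receiver whose sampling and carrier
# oscillators both run 2.5 ppm fast relative to the transmitter.
f_x = 2.000000e6          # transmitter sampling frequency (Hz)
f_d = 2.000005e6          # receiver sampling frequency (Hz)

# eta_o = Delta_T / T_x, with Delta_T = 1/f_d - 1/f_x and T_x = 1/f_x
delta_T = 1.0 / f_d - 1.0 / f_x
eta_o = delta_T * f_x     # dividing by T_x equals multiplying by f_x

# eps_o = omega_d - omega_x, expressed here per received sample
carrier_offset_hz = 6e3   # assumed carrier frequency difference (Hz)
eps_o = 2.0 * math.pi * carrier_offset_hz / f_x   # rad/sample

print(f"eta_o = {eta_o:.3e} (relative sampling offset)")
print(f"eps_o = {eps_o:.4f} rad/sample")
```

With these values, η_o is on the order of 10⁻⁶ and ε_o a few hundredths of a radian per sample, i.e., ppm-level oscillator inaccuracies.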
The transmitter broadcasts a complex signal x(n) that, in its discrete-time form, is known to the receiver. However, due to noise, the channel, and mismatches in carrier and sampling frequencies at the transmitter and the receiver, the discrete-time signal at the receiver becomes a noisy, phase-rotated, and resampled version of the transmitted signal: v(n) is the measurement noise, y_n^o accounts for sampling x(t) with sampling frequency offset η_o + β(n), and the multiplicative term e^{j∑_{i=1}^{n}(ε_o+φ(i))} accounts for the carrier frequency offset. This is a general system model that is relevant, e.g., for the following scenarios. Firstly, it holds when known training data is used to estimate the channel impulse response and frequency offsets to improve subsequent information demodulation. Secondly, this general system model directly applies to the bistatic or multistatic radar scenario, in which the receiver knows the transmitted signal but is interested in tracking the channel and frequency offsets to estimate range and velocity. Thirdly, in the known-interference cancellation scenarios, the received noise v(n) can be considered to contain an unknown signal of interest, which is uncorrelated with the known signal x(n) that is suppressed to facilitate processing the signal of interest.
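To make the model concrete, the following sketch generates a received sequence from a known x(n) under fixed offsets; the linear-interpolation resampler and the white Gaussian noise model are our own illustrative choices, not prescribed by the article:

```python
import cmath
import math
import random

def received_signal(x, w, eps_o, eta_o, noise_std=0.0, seed=0):
    """Generate d(n): the known signal x(n), resampled with relative
    offset eta_o (linear interpolation), filtered by the FIR channel w,
    rotated by the accumulating carrier phase, plus noise v(n)."""
    rng = random.Random(seed)
    M = len(w)
    d = []
    phase = 0.0
    for n in range(len(x)):
        phase += eps_o                      # accumulates sum_i eps_o
        acc = 0.0 + 0.0j
        for k in range(M):
            # fractional sampling instant seen by the receiver
            t = n * (1.0 + eta_o) - k
            i = math.floor(t)
            frac = t - i
            x0 = x[i] if 0 <= i < len(x) else 0.0
            x1 = x[i + 1] if 0 <= i + 1 < len(x) else 0.0
            acc += ((1.0 - frac) * x0 + frac * x1) * w[k]
        v = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        d.append(acc * cmath.exp(1j * phase) + v)
    return d
```

With zero offsets, an identity channel, and no noise, the function returns x(n) unchanged, which is a convenient sanity check.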

III. ADAPTIVE ESTIMATION AND TRACKING
In order to derive an algorithm for estimating and tracking the parameters described in the system model, we first define the instantaneous error of the estimation process as in (3), where w_{n−1}, ε(n−1), and η(n−1) are respectively the estimates of the channel impulse response w_o, carrier frequency offset ε_o, and sampling frequency offset η_o at iteration n, and y_n is the result of resampling x(n) with η(n−1). The instantaneous error e(n) will contain v(n) and excess noise from the algorithm's operation. In the case of known-interference cancellation, the instantaneous error e(n) would additionally contain some unknown signal of interest. The aim of the adaptive filter is to update iteratively the system model parameter estimates w_n, ε(n), and η(n) so that a nonnegative cost function J(n) is successively reduced, as expressed in (5). This will generally ensure that after every iteration, the adaptive filter improves its estimates of the parameters that we are trying to model.

A. Mean-Square Error
We define the cost function as the mean-square value of the estimation error, i.e., the MSE in (6). We opted for the MSE over other potential error measures, e.g., weighted least squares, because of the simplicity of the resulting algorithm. Note that in practical applications of adaptive filtering, the use of ensemble averaging is not feasible, as we adapt the filter in an on-line manner based on a single realization of the estimation error e(n) as it evolves across the iteration index n. Therefore, during the derivation of the proposed algorithm, we proceed by ignoring the expectation operation in the cost function (6), as is typical of the stochastic gradient descent method [34].
We apply the method of stochastic gradient descent for a sequential computation of the model parameters, using gradients of the performance surface in seeking its minimum. Even though only one of the estimated parameters, namely the channel impulse response w_n, is complex-valued, in the following derivation we also consider ε(n) and η(n) to be complex-valued, as this lays a clear and consistent foundation for the later steady-state analysis of the adaptive filter. In order to accommodate complex-valued ε(n) and η(n), we use the real and imaginary part operators, ℜ{z} and ℑ{z}, where appropriate.
We obtain the gradient vector at any point on the performance surface by differentiating the cost function (6) with respect to the model parameter estimates, resulting in the gradient (7) with components (8a)-(8c), where y'_n is the derivative of y_n.
When using (8b) and (8c) in practice, we are only interested in the partial derivative of the complex function e(n) with respect to the real parts of the parameters ε(n) and η(n). Therefore, relying on the Cauchy-Riemann equations [35], we can simplify the partial derivatives and consider only their real parts, as in (9b) and (9c).

B. Algorithm
We formulate the updating rules of the algorithm using the stochastic gradient in (7) by moving in the opposite direction of the gradient vector, which yields (10a)-(10c), where w_0, ε(0), and η(0) are initial guesses and μ_w, μ_ε, and μ_η are fixed positive step size parameters that control the convergence speed and steady-state performance of the algorithm. For computing the gradient vector at every iteration of the algorithm, (9b) and (9c) are to be used in (10b) and (10c). However, for carrying out the steady-state analysis, we will rely on the full complex-valued gradient and use (8b) and (8c) in (10b) and (10c), while (8a) is always used in (10a). We also note that the partial derivative (9c) with respect to the sampling rate offset estimate η(n−1) includes a time derivative of the resampled signal vector. If the third derivative of y_n exists, it is beneficial to use the centered first-order divided difference, which has an approximation error of order two [36, p. 172]. This is equivalent to considering w_{n−1} to be time-invariant and taking the centered first-order difference of (y_n w_{n−1}). Alternatively, the first-order backward divided difference can be used, which requires neither the computation of y_{n+1} nor the existence of the third derivative, but has an approximation error of order one.
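The two divided-difference options mentioned above can be sketched as follows (in practice they are applied element-wise to the resampled vector; scalar versions are shown for brevity):

```python
def centered_diff(y_prev, y_next, h=1.0):
    """Centered first-order divided difference: O(h^2) error,
    assuming the third derivative exists (cf. [36, p. 172])."""
    return (y_next - y_prev) / (2.0 * h)

def backward_diff(y_curr, y_prev, h=1.0):
    """Backward first-order divided difference: O(h) error, but
    needs no look-ahead sample y_{n+1}."""
    return (y_curr - y_prev) / h
```

For a quadratic sequence such as y(n) = n², the centered difference is exact: with h = 1, (y(3) − y(1))/2 = 4 equals the true derivative 2n at n = 2.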
To produce y_n, the sampling rate of the known signal x_n needs to be converted. Various methods exist for arbitrary sampling rate conversion (SRC) [37], e.g., the Lagrange interpolator [38], but the SRC method can be selected independently of the proposed algorithm. If prior knowledge of the estimated parameters is available, this knowledge may be used to speed up the start-up process of the algorithm. Otherwise, w_0, ε(0), and η(0) can be initialized to zero. Conclusively, the adaptive algorithm for iteratively estimating and tracking a wireless channel under carrier and sampling frequency offsets is listed as Algorithm 1 and illustrated in Fig. 2. It should be noted that in order for the algorithm to be able to handle sampling frequency offsets, several filter taps should be allocated, i.e., M > 1, even if the channel itself can be modeled by a single complex coefficient. Furthermore, in general there are several equivalent formulations for complex-valued adaptive filters [39, p. 69], and corresponding equivalent formulations exist also for the proposed algorithm. An open-source implementation of the algorithm is available as part of an adaptive filters toolkit.¹
Fig. 2. System model with the proposed adaptive filter.
Algorithm 1: LMS-Type Frequency Offsets Tracking.
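A minimal sketch of the resulting LMS-type recursion is given below. The `resample` callback, the sign conventions of the two offset updates, and the phase accumulator are our own illustrative simplifications of Algorithm 1, not a verbatim transcription:

```python
import cmath

def lms_track(d, resample, mu_w, mu_eps, mu_eta, M, n_iter):
    """Jointly adapt channel taps w, carrier offset eps, and sampling
    offset eta so that exp(j*theta) * (y_n . w) approximates d(n).
    resample(tau, n, M) must return the length-M regressor y_n of the
    known signal at accumulated time offset tau, and its derivative."""
    w = [0.0 + 0.0j] * M
    eps, eta = 0.0, 0.0
    theta, tau = 0.0, 0.0    # accumulated phase / time offset estimates
    for n in range(n_iter):
        theta += eps
        tau += eta
        y, dy = resample(tau, n, M)
        d_hat = cmath.exp(1j * theta) * sum(yk * wk for yk, wk in zip(y, w))
        e = d[n] - d_hat
        # channel update (cf. (10a)): negative gradient of |e|^2 w.r.t. w
        rot = cmath.exp(-1j * theta)
        w = [wk + mu_w * e * rot * yk.conjugate() for wk, yk in zip(w, y)]
        # carrier update (cf. (10b)): Im{e * d_hat^*} senses the phase error
        eps += mu_eps * (e * d_hat.conjugate()).imag
        # sampling update (cf. (10c)): correlate e with the time derivative
        s = cmath.exp(1j * theta) * sum(dk * wk for dk, wk in zip(dy, w))
        eta += mu_eta * (e * s.conjugate()).real
    return w, eps, eta
```

With a single-tap static channel and zero offsets, the recursion reduces to plain LMS and the channel estimate converges geometrically toward the true tap.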

C. Computational Cost
A useful property of the proposed algorithm, mainly due to the chosen cost function, is its computational simplicity: each iteration requires only a limited number of straightforward calculations. Evaluation of the proposed algorithm requires 12M + 26 real-valued multiplications and 14M + 13 real-valued additions per iteration. There are various ways to perform the specific calculations, but the resulting overall filter complexity will be of the same order of magnitude. However, these numbers do not include the arbitrary SRC, which can be implemented in several ways with varying complexity and accuracy. For example, Lagrange interpolation can be implemented with computational complexity growing linearly with the interpolation order [40].

D. Convergence Properties
For a given system with a fixed set of parameters, the choice of step sizes μ_w, μ_ε, and μ_η is effectively the only way to affect the performance of the algorithm. For example, in order to speed up the initial adaptation process, it might be desirable to use large step sizes, which minimize the instantaneous error at every iteration as much as possible, yet do not cause the algorithm to diverge. An approximate way of finding the upper bounds for the step sizes of an adaptive filter is to expand the instantaneous output error as a Taylor series [41], [39, p. 86], which in this case gives (13), where Δw_{n−1}, Δε(n−1), and Δη(n−1) are the estimate updates and h.o.t. denotes the truncated higher-order terms of the expansion. From (10a), (10b), and (10c), by considering the full complex-valued gradient vector, we get (14a), (14b), and (14c), respectively. For sufficiently small Δw_n, Δε(n), and Δη(n), the values of the higher-order terms in (13) can be neglected and, therefore, in the following analysis we approximate the expansion without them. Thus, evaluating the partial derivatives in (13) and substituting into (14a), (14b), and (14c) yields, after direct simplification, (15). In order to ensure convergence, the norm of the left-hand side must not be greater than that of the right-hand side, as expressed in (16). The goal in (16) is reached if the relation (17) holds, which in turn implies the bounds (18) on the choice of the step sizes μ_w, μ_ε, and μ_η. However, the expressions above are merely necessary conditions for the stability of the proposed algorithm. The actual step size values that achieve stability are slightly smaller than the derived bounds due to the approximation used, i.e., discarding the higher-order terms in the error expansion.
We see that all quantities in (18) are positive, so the convergence properties depend on the slope but not on the sign of the gradient vector, and that the upper bounds are coupled, so the step sizes are to be selected collectively. That is, the upper bound for each step size depends on the other two step sizes, and convergence can be reached only if the relation in (17) is satisfied. The preceding analysis based on the Taylor series expansion of the instantaneous error provides two results: firstly, step size bounds that are necessary but not sufficient conditions for the algorithm to converge and, secondly, bounds that can potentially be used to derive a normalized variant of the algorithm. As is, the adaptive filter assumes fixed step sizes, but an approach could also be developed that varies the step sizes to optimize convergence speed and subsequent steady-state performance.
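As a hypothetical illustration of the normalized variant hinted at above, the channel step size could be scaled by the instantaneous regressor energy in the spirit of NLMS; this is our own sketch, not a variant derived in the article:

```python
def normalized_step(mu_bar, y, delta=1e-8):
    """Scale a nominal step size mu_bar by the instantaneous energy of
    the regressor y; delta is a small regularizer avoiding division by
    zero. Analogous rules could be devised for mu_eps and mu_eta."""
    energy = sum(abs(yk) ** 2 for yk in y)
    return mu_bar / (delta + energy)
```

For a unit-energy regressor of two taps, the returned step is roughly half the nominal value, shrinking further as the input energy grows.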

E. Comparison
The application-specific methods for estimating a wireless channel and frequency offsets typically require the waveform to have a certain structure. The most general of those techniques aims to suppress known interference so as to provide physical layer security, and relies on the waveform being cyclic with some period L [19]. Evaluation of that method for one cyclic block of length L requires 25L + 9 real-valued multiplications, 18L − 1 real-valued additions, L + 1 real-valued divisions, L + 1 evaluations of atan2(), and at least one L-point discrete Fourier transform. This puts the referenced and proposed methods roughly on par in terms of computational complexity per data point. However, due to its block-based nature, the reference method can take advantage of parallel processing. Also, methods that rely on features built into the waveform generally require fewer samples than the proposed algorithm to provide accurate parameter estimates. Then again, the repetitive waveform structure required by the reference method could be a vulnerability in physical layer security applications.
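The per-iteration and per-block operation counts quoted above can be tabulated as simple formulas (multiplications and additions only; the divisions, atan2() evaluations, and the DFT of the reference method are left out):

```python
def proposed_ops(M):
    """Real-valued operations per iteration of the proposed algorithm
    for a channel of order M, as stated in Section III-C (SRC excluded)."""
    return {"mul": 12 * M + 26, "add": 14 * M + 13}

def reference_ops(L):
    """Real-valued multiplications/additions per cyclic block of length L
    for the reference interference-suppression method [19]."""
    return {"mul": 25 * L + 9, "add": 18 * L - 1}
```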

IV. STEADY-STATE ANALYSIS
An important performance measure of an adaptive filter, typically used in the literature, is its steady-state excess mean-square error (EMSE) [35]. In this section, we derive the total EMSE in terms of three EMSEs, each related to one update equation in (10). The analysis relies on energy conservation arguments [33] and on decoupling the errors of the separate update equations by solving a system of linear equations [42]. In order to accommodate the errors accumulated by the frequency offset update equations, we extend the existing methodology to account for what we will refer to as the self-induced nonstationarity. Furthermore, to make the analysis tractable, we omit the time-varying terms φ(n) and β(n) of the system model here. That is, the focus is on steady-state analysis rather than tracking analysis, considering a quasi-static channel.

A. Self-Induced Nonstationarity
In practice, the frequency offset estimates ε(n) and η(n) are bound to differ from the actual parameters ε_o and η_o, resulting in the estimation errors ε̃(n) = ε_o − ε(n) and η̃(n) = η_o − η(n). This is especially so during the start-up phase of the algorithm, but also during steady state, as gradient noise affects the estimates at each iteration. Therefore, the accumulating estimation errors ∑_{i=1}^{n} ε̃(i−1) and ∑_{i=1}^{n} η̃(i−1) inevitably cause a phase shift and a fractional time delay, or self-induced nonstationarity, which the channel estimate w_n will then try to compensate for. In order to proceed with the steady-state analysis, we first need a way to express how those accumulated estimation errors affect the channel update equation (10a).
Based on (3), we define the total a priori error as the error between the received signal and the estimated signal, discarding the noise term v(n). By shifting both sides of the a priori error equation in phase by −∑_{i=1}^{n} ε(i−1) and in time by −∑_{i=1}^{n} η(i−1), we get (20), where, for notational simplicity, the phase- and time-shifted a priori error is denoted by its own symbol, and T_n is an arbitrary time-shift matrix of size M × M that, when multiplied with x_n, delays the signal x_n by ∑_{i=1}^{n} η(i−1). From (20), we can define w_n^o as in (22). In order to proceed, we call on the following assumption.
A.1: At steady state, as n → ∞, the instantaneous estimation errors ε̃(n) and η̃(n) satisfy the stated conditions, where f_max is the maximum frequency component of x_n. This is a reasonable assumption because in steady state we expect the estimation errors to vary around zero. Relying on A.1, we can use a linear approximation [43] to write (20) as (23), where ∘ denotes the Hadamard product, i.e., element-wise multiplication, applied with an auxiliary row vector. By expanding (23), and ignoring the cross-terms that include both ε̃(n−1) and η̃(n−1), as they are very small under A.1, (23) can be rewritten in a form which, by substituting (22) at index n−1, simplifies accordingly. Finally, taking w̃_n = w_n^o − w_n to be the estimation error of the channel and reversing the phase and time shift introduced in (20), the a priori error can be expressed as the sum of three terms, among them y_n w_{n−1}^o jε̃(n−1) e^{j∑_{i=1}^{n} ε(i−1)}, and we denote the three terms on the right-hand side as the a priori errors of the three update equations, so that e_{w,a}^n(n) = y_n w̃_{n−1} e^{j∑_{i=1}^{n} ε(i−1)}, (27a), where the superscript n denotes this first set of definitions for the a priori errors.

B. Mean-Square Performance
Following the well-known energy conservation relation method [28], we also define a second set of a priori errors, among them e_{η,a}(n) = y'_n w_{n−1} η̃(n−1) e^{j∑_{i=1}^{n} ε(i−1)}, (28c), so that the total error e(n) is the sum of the a priori errors and the measurement noise.
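Collecting the definitions above, the error decomposition can be summarized as follows (reconstructed from the fragments in the text; the typesetting may differ slightly from the article's original equations):

```latex
e(n) = e_{w,a}(n) + e_{\epsilon,a}(n) + e_{\eta,a}(n) + v(n),
\quad\text{with}\quad
\begin{aligned}
e_{w,a}(n)        &= \mathbf{y}_n \tilde{\mathbf{w}}_{n-1}\, e^{j\sum_{i=1}^{n}\epsilon(i-1)},\\
e_{\epsilon,a}(n) &= j\,\mathbf{y}_n \mathbf{w}_{n-1}\,\tilde{\epsilon}(n-1)\, e^{j\sum_{i=1}^{n}\epsilon(i-1)},\\
e_{\eta,a}(n)     &= \mathbf{y}'_n \mathbf{w}_{n-1}\,\tilde{\eta}(n-1)\, e^{j\sum_{i=1}^{n}\epsilon(i-1)}.
\end{aligned}
```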
Similarly, we define the a posteriori errors, e.g., e_{w,p}(n) = y_n w̃_n e^{j∑_{i=1}^{n} ε(i−1)}, (30a).
A.2: The noise sequence v(n) is stationary, with variance σ_v², and statistically independent of the a priori errors e_{w,a}(n), e_{ε,a}(n), and e_{η,a}(n).
Under the above justifiable assumption, we find that the MSE is equivalently given in terms of the noise variance and the three EMSEs. Employing the energy conservation relation method and relying on the two sets of a priori errors, it is shown in the Appendix that the system of equations (33) holds. This system can now be solved for the EMSEs ζ_w, ζ_ε, and ζ_η. To do so, we consider the following two cases.
1) Using Separation Principle: One way to solve the equations in (33) is by imposing the following assumption A.3.
This assumption is reasonable at steady state, since the behavior of the a priori errors is unlikely to be sensitive to the input data. It is similar to the separation principle assumption made in, e.g., [33], [42], and allows us to factor the expectations in (33). Furthermore, we make the following assumptions.

A.4:
In steady state for a static channel, as n → ∞, the channel estimate is close to the actual channel, w_{n−1} → w_o.
A.5: In steady state, for sufficiently small η_o, the stated equalities hold.
A.6: In steady state, the two sets of a priori errors are equivalent, i.e., E|e_{k,a}^n(n)|² = E|e_{k,a}(n)|² for k ∈ {w, ε, η}. Using assumptions A.3 through A.6, and solving (33) for ζ_w, ζ_ε, and ζ_η, we obtain the expressions (35) for the EMSEs of the proposed algorithm, where the denominator γ is given in (36), R is the covariance matrix R = E[x_n^* x_n], P is the covariance matrix of the derivative signal, P = E[(x'_n)^* x'_n], and Q = w_o(w_o)^*.
Note that, in order for the algorithm to remain stable, the denominator of the EMSEs needs to be positive. If we consider an approximation of the denominator without the self-induced nonstationarity terms, i.e., the last two terms in (36), then this result has an implication equivalent to that of the simple approximation (18), which we derived using the Taylor series expansion of the instantaneous error.
2) Assuming Gaussian White Input Signals: For Gaussian white input signals (with R = σ_x² I), relying on A.4 and A.5, (33) can be solved more accurately by resorting to the following independence assumption.
A.7: At steady state, the estimation errors w̃_n, ε̃(n), and η̃(n) are all statistically independent of x_n, x_n w_o, and x'_n w_o. This is an extension of the independence assumption that is widely used for analysing the performance of adaptive filters [33]. Relying on the independence assumption A.7 and following the same reasoning that is used for analysing the steady-state performance of the LMS adaptive filter [35, p. 296], it can be verified that (37) holds, where k is either w, ε, or η, and where we first rely on the independence of successive samples of x(n) and consequently on the samples being identically distributed as well. The last equation in (37) does not hold precisely but is an approximation, because the derivative itself is not independently and identically distributed. Still, for a channel w_o with a relatively flat frequency response, this approximation can be practical, as will be seen in the results section. Using A.4, A.5, and A.7, and solving (33) for ζ_w, ζ_ε, and ζ_η, we obtain (38), where the denominator γ is the same for all three expressions.

V. NUMERICAL RESULTS

In order to verify the theoretical steady-state MSE expressions and evaluate the performance of the proposed algorithm, the theoretical results are first compared to steady-state simulations, where the channel and frequency offsets are assumed known to the algorithm and the time-varying terms are omitted; then time-varying simulations together with proof-of-concept RF measurements are presented.

A. Steady-State Results
In Fig. 3, the steady-state theoretical MSEs obtained from expressions (35) and (38) are compared with the MSE observed in simulations. The simulations are run with different channel weight vectors w_o, each of length M = 3 with a rather flat frequency response. The input signal x_n is Gaussian with unit variance and the noise v(n) is Gaussian with variance σ_v² = 10⁻³. From here on, in order to make the results relatable, we refer to the sampling frequency offset as Δf = f_d − f_x instead of η_o. The simulated frequency offsets are ε_o = 6 kHz and Δf = 5 Hz, which, for a carrier frequency of 2.4 GHz and a sampling frequency of 2 MHz, is equivalent to a 2.5 ppm oscillator inaccuracy. Note that the MSE expressions do not depend on the frequency offset values, since the offsets themselves inherently do not affect the energy conservation relation. This is in alignment with our extensive simulation results for practical ranges of ε_o and Δf (not shown herein), in which the steady-state MSE is indifferent to the offset values. Thus, the simulated results are plotted only for these two example frequency offsets.
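The quoted 2.5 ppm equivalence is easy to verify:

```python
ppm_carrier = 6e3 / 2.4e9 * 1e6    # eps_o = 6 kHz at a 2.4 GHz carrier
ppm_sampling = 5.0 / 2e6 * 1e6     # delta_f = 5 Hz at 2 MHz sampling
print(ppm_carrier, ppm_sampling)   # both approximately 2.5 ppm
```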
Each simulation result is the steady-state statistical average of 1024 runs, with 5000 iterations in each run. The average of the last 2500 entries of the ensemble-average curve is then used as the simulated MSE value. Oversampling is used to prevent interpolation errors from skewing the simulation results. In Fig. 3, the analysis focuses separately on each frequency offset estimate combined with the channel estimation. The comparison shows that both expressions match the simulation results well at small values of μ_ε and μ_η. However, (38) gives a better match with the simulation results for larger μ_ε and μ_η values, which supports the use of A.7.
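The ensemble-averaging procedure described above can be sketched as follows; `run_once` is a placeholder callback producing the per-iteration squared-error sequence of one independent realization:

```python
def steady_state_mse(run_once, n_runs=1024, n_iter=5000, tail=2500):
    """Average the squared error over n_runs independent realizations,
    then average the last `tail` points of the ensemble-average curve,
    mirroring the simulation procedure in the text."""
    avg = [0.0] * n_iter
    for r in range(n_runs):
        errs = run_once(r, n_iter)          # seed r -> one realization
        for i, e2 in enumerate(errs):
            avg[i] += e2 / n_runs
    return sum(avg[-tail:]) / tail
```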
Figs. 4 and 5 compare the theoretical MSE obtained from (38) with the simulated MSE for various μ_w over a range of μ_ε or μ_η. Again, the results show a good match between theoretical and simulated results, especially at smaller step size values, when the steady-state assumptions are better justified. However, in general the sampling frequency offset update equation is not well suited for operation with disproportionately selected step sizes: a carrier frequency offset can usually be recovered, but if the signals become misaligned in time because of persisting large estimation errors in the sampling frequency offset, then this can be difficult to recover from. Furthermore, Figs. 4 and 5 also illustrate the relevance of the step sizes' upper bound (18). For visual clarity, only a single upper bound is calculated and plotted, by taking the two varied step sizes to be equal in (18). As the step sizes approach the upper bound, the performance of the filter deteriorates and, since the filter leaves the steady state, the match between theoretical and simulated MSE results also declines. Finally, Fig. 6 presents a comparison of the theoretical and simulated MSEs of the proposed algorithm when all of the system parameters are estimated simultaneously. As shown by all the foregoing numerical results in Figs. 3-6, the theoretical results match the simulations very well.

B. Time-Varying Results
In this subsection, using simulations, we analyze the performance of the proposed filter when the frequency offsets are time-varying, i.e., we focus on the effect of φ(n) and β(n) on the algorithm's performance. Fig. 7 illustrates the filter's ability to track long-term changes in the time-varying terms. The simulations are started with perfect knowledge of the initial state of the channel and with zero frequency offsets. Then both frequency offsets are varied over time, either gradually or abruptly, as shown in Fig. 7 with the dashed lines. Other simulation parameters are kept the same as previously, including the ensemble averaging. The simulation results indicate that the adaptive filter is able to track those changes, regardless of whether the parameters change gradually or abruptly. As a result, the MSE is stable over time, except for a brief readjustment period during the abrupt frequency offset changes, which is expected.
In contrast, Fig. 8 demonstrates the filter's tracking performance under short-term changes, i.e., phase noise and sampling time jitter. Both are modelled as first-order autoregressive processes with the process parameters α_φ and α_β close to one and variances σ²_φ and σ²_β (the exact values of which are given in Fig. 8). The algorithm is run for 10^6 iterations and, again, the simulations are started with perfect knowledge of the initial state of the channel, yet without knowledge of the noise processes. The simulations illustrate three cases: no adaptation at all, adaptation of only the channel estimate w_n, and adaptation of all the parameters. The case without adaptation serves as a baseline for the MSE performance in the given noisy circumstances, while the other cases illustrate the benefits of adapting the channel and frequency offset estimates. The results show that, even though excessive phase noise and sampling jitter can degrade the algorithm's performance, adapting all the parameters still has a clear benefit compared to limited or no adaptation.
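The first-order autoregressive impairment model is straightforward to reproduce in simulation; a minimal sketch (the helper name and parameter values are illustrative, not those of Fig. 8):

```python
import numpy as np

def ar1_process(n, alpha, sigma2, rng):
    """Generate n samples of a zero-mean AR(1) process
    x[k] = alpha * x[k-1] + v[k], with the innovation variance scaled
    so that the stationary variance of x equals sigma2. Illustrative
    model of the phase noise / sampling jitter processes."""
    sigma_v = np.sqrt(sigma2 * (1.0 - alpha ** 2))  # innovation std
    x = np.empty(n)
    x[0] = rng.normal(scale=np.sqrt(sigma2))        # start in stationarity
    for k in range(1, n):
        x[k] = alpha * x[k - 1] + sigma_v * rng.normal()
    return x

rng = np.random.default_rng(1)
# alpha close to one yields the slowly wandering behavior described above
phase_noise = ar1_process(10 ** 5, alpha=0.999, sigma2=1e-4, rng=rng)
```

With alpha close to one the process decorrelates very slowly, so its short-term behavior looks like a random walk even though its long-term variance stays bounded at sigma2.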

C. Experimental Results
The experiment is carried out indoors using two USRP-2900 software-defined radios with dipole antennas. The radios have internal temperature-compensated crystal oscillators with a frequency accuracy of a couple of parts per million, presenting a fair scenario for analyzing the algorithm. The radios are positioned in opposite corners of an office room, with about five meters of line-of-sight distance between them, as shown in Fig. 9. As such, the experimental setup is static, with only the inherent oscillator drifts contributing a slowly time-varying component. The measurements are done in a relatively quiet section of the 2.4 GHz ISM frequency band, so that signals from other wireless devices do not affect the measurements, using a sampling rate of 2 MHz. The transmitter broadcasts a bandlimited Gaussian noise signal, which is known to the receiver entirely. As such, the experiment illustrates the known-interference cancellation scenario, where the residual error signal could contain a signal of interest. Two signal bandwidths, 1 MHz and 0.5 MHz, are used with transmit powers of −60 dBm/Hz or −90 dBm/Hz. The receiving node receives the bandlimited noise signal over the air and records it. The algorithm is then run offline on the recordings.
The length of the estimated channel vector w_n is taken to be M = 9, which is more than sufficient for this scenario, and all of the estimated parameters are initialized to zero. For the algorithm to converge, it is required that the known and received signal streams be coarsely aligned in time (i.e., the difference in the two streams' starts may not exceed M − 1 samples). That coarse alignment is provided by onset detection, i.e., by comparing the received signal's energy to a threshold. Fig. 10 shows the measured signal spectra at different stages of the system model. It can be observed that suppression of the known interference is not significantly affected by its bandwidth. Furthermore, when the received known interference is substantially above the noise floor, the MSE, i.e., the residual signal, is much higher than the measurement noise floor. This is caused by the nonlinearities induced in the USRP-2900 RF front ends, which the algorithm does not account for. When the received known interference is less powerful, those nonlinearities do not affect cancellation. Based on measurements at other received known-interference power levels, which are omitted for brevity, in this scenario the algorithm requires that the signal be at least 4 dB above the noise floor in order to provide stable parameter estimates.
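Such coarse alignment can be achieved with a simple energy detector; the sketch below is an illustrative stand-in (the window length and threshold factor are assumptions, not the values used in the experiment):

```python
import numpy as np

def detect_onset(rx, win=64, factor=10.0):
    """Return the first index where the short-term energy of rx exceeds
    `factor` times a noise-floor estimate taken from the start of the
    recording (assumed signal-free). Illustrative onset detector."""
    power = np.abs(rx) ** 2
    noise = np.mean(power[:win]) + 1e-12     # noise-floor estimate
    kernel = np.ones(win) / win
    smoothed = np.convolve(power, kernel, mode="valid")  # sliding energy
    above = np.nonzero(smoothed > factor * noise)[0]
    return int(above[0]) if above.size else -1
```

Because the sliding window spans `win` samples, the detected index can precede the true onset by up to `win − 1` samples, which is acceptable here since only coarse alignment within M − 1 samples is required after trimming.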
Finally, Fig. 11 demonstrates the algorithm's performance for the purpose of known-interference cancellation while estimating and tracking the channel together with the frequency offsets (Residuals 2 and 3), as opposed to estimating and tracking the channel without compensating for the frequency offsets (Residual 1). It is evident that explicit adaptation of the frequency offsets gives better short-term and long-term performance. The results also show that continuous frequency offset tracking is necessary in practice (Residuals 4, 5, and 6), due to the offsets' time-varying nature. Again, it is clear that the experimental MSE does not reach the noise floor, as excessive phase noise, sampling time jitter, and nonlinear distortions degrade the performance of the algorithm. Nevertheless, the experimental results demonstrate the efficiency of the proposed algorithm in estimating and compensating for the time-varying carrier and sampling frequency offsets of an unknown channel.
We compared the proposed algorithm to the method in [19] using a separate set of measurements with a cyclic bandlimited Gaussian noise waveform having period L. The two algorithms eventually achieved a similar level of MSE, as long as the period L was chosen so that the carrier frequency offset remained within the reference algorithm's estimation range. As such, only the proposed algorithm's results are presented in the figures for brevity. The reference algorithm does have an advantage over the proposed algorithm in that it provides estimates of the channel and frequency offsets more quickly. However, this advantage relies on the assumptions that the used waveform is cyclic with period L and that the combination of period L and sampling rate is appropriate for the frequency offsets, the latter of which significantly limits the acceptable range of L. The proposed algorithm, in contrast, is not limited to cyclic waveforms and is therefore also free from the related estimation range and accuracy limitations.
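The estimation-range restriction can be made concrete with a generic argument (an illustration, not the exact derivation of [19]): if the carrier frequency offset is inferred from the phase rotation accumulated over one waveform period of L samples at sampling rate f_s, the estimate is unambiguous only while that rotation stays within (−π, π], i.e.,

```latex
2\pi \frac{f_{\mathrm{CFO}}}{f_s} L \in (-\pi, \pi]
\quad\Longrightarrow\quad
|f_{\mathrm{CFO}}| < \frac{f_s}{2L},
```

so increasing L improves the estimate's resolution but proportionally shrinks the range of offsets it can represent, which is why the combination of L and the sampling rate must match the expected frequency offsets.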

VI. CONCLUSION
This article proposed an adaptive filter for jointly and explicitly estimating the channel impulse response, carrier frequency offset, and sampling frequency offset between a transmitter and receiver pair. The proposed algorithm relies on stochastic gradient descent to minimize the mean-square error and is therefore computationally simple, yet effective. Compared to existing methods, the proposed adaptive filter facilitates estimating the channel and frequency offsets without requirements on the used waveform. Stability and convergence of the algorithm depend on the proper selection of step sizes in relation to the other system parameters. Hence, upper bounds for the step sizes were derived and presented. Furthermore, this article also provided a theoretical steady-state analysis of the proposed adaptive filter. Novel expressions for the excess mean-square error were derived by extending the energy conservation relation to account for the self-induced nonstationarity inherent in the proposed adaptive filter. The validity of the theoretical expressions was corroborated through comparison to simulations. Also, simulation results were presented for time-varying and noisy frequency offsets. Finally, the algorithm was validated on measurement data.

APPENDIX
The following analysis extends the energy conservation relation [33], which is established by expressing the update equations in (10) in terms of the estimation errors $\tilde{w}_n$, $\tilde{\epsilon}(n)$, and $\tilde{\eta}(n)$. Subtracting both sides of (10a) from $w_n^o$, both sides of (10b) from $\epsilon^o$, and both sides of (10c) from $\eta^o$, we get (40). Furthermore, by multiplying both sides of (40a) with $y_n e^{j\sum_{i=1}^{n}\epsilon(i-1)}$ from the left, (40b) with $y_n w_{n-1} e^{j\sum_{i=1}^{n}\epsilon(i-1)}$, and (40c) with $y_n w_{n-1} e^{j\sum_{i=1}^{n}\epsilon(i-1)}$, we see that the a priori (28) and a posteriori (30) estimation errors are related via (41). Equations (40) and (41) provide an alternative representation of the adaptive filter in terms of the error quantities. This is useful, as it allows relating the steady-state behavior of these errors. So, rearranging (41a), (41b), and (41c) allows us to express the total error $e(n)$ separately in terms of the three sets of a priori and a posteriori errors. Substituting the right-hand sides of the above into (40a), (40b), and (40c) gives, respectively, identities on each side of which we have a combination of a priori and a posteriori errors, while the step sizes cancel out. By evaluating the energies of both sides, we find that the energy equalities in (47) hold. Substituting (47) into (44a), taking the expectation of both sides of (44a), (44b), and (44c), and using that $E\|\tilde{w}_n\|^2 = E\|\tilde{w}_{n-1}\|^2$ in steady state, yields equalities in terms of the a priori and a posteriori errors. However, we know from (41) how those errors are related. Therefore, using (41), the above collapse to the error variance relations (50) in terms of the a priori errors and noise only. Finally, substituting (29) into the equations in (50), while also relying on A.2, we arrive at the equations in (33).
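For orientation, the baseline (unextended) energy conservation relation of [33] for a single weight-error vector has the well-known form below; the appendix generalizes this identity to the coupled updates and the moving optimum $w_n^o$. This is a standard identity, reproduced here only as a sketch of the starting point:

```latex
\|\tilde{w}_n\|^2 + \frac{|e_a(n)|^2}{\|y_n\|^2}
  = \|\tilde{w}_{n-1}\|^2 + \frac{|e_p(n)|^2}{\|y_n\|^2},
```

which holds exactly for any step size, since the step size cancels once the update is expressed through the a priori error $e_a(n)$ and the a posteriori error $e_p(n)$, mirroring how the step sizes cancel in (47).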

Fig. 3. Simulated and theoretical MSE curves, relying on the separation principle and Gaussian input, versus μ_ε for μ_w = 0.0025 and μ_η = 0 on the left, and versus μ_η for μ_w = 0.005 and μ_ε = 0 on the right.

Fig. 4. Simulated (only markers) and theoretical (solid lines) MSE curves at various μ_w versus μ_ε for μ_η = 0. The dashed vertical line indicates the upper bound for the two step sizes when μ_w = μ_ε.
Fig. 5.

Fig. 10. Power spectral densities of the transmitted, received, and residual signals, along with the noise floor at the receiver, in steady state, discarding the start-up phase of the algorithm.

Comparing (44a) with (44b) and (44c), where (44c) reads

$|\tilde{\eta}(n)|^2 + \frac{|e_{\eta,a}(n)|^2}{|y_n w_{n-1}|^2} = |\tilde{\eta}(n-1)|^2 + \frac{|e_{\eta,p}(n)|^2}{|y_n w_{n-1}|^2}$, (44c)

we see that the main difference concerns the interpretation of the terms $w_n^o - w_n$ and $w_n^o - w_{n-1}$. While the former, on the left-hand side of (44a), can be recognized as $\tilde{w}_n$, just like the terms on the left-hand sides of (44b) and (44c), the latter is not $\tilde{w}_{n-1}$, since, due to the self-induced nonstationarity, $\tilde{w}_{n-1}$ is defined as $\tilde{w}_{n-1} = w_{n-1}^o - w_{n-1}$ in terms of $w_{n-1}^o$ and not $w_n^o$. In order to explain the relevance of the energy relation equations to the steady-state analysis of the adaptive filter, we first need to relate $\|w_n^o - w_{n-1}\|^2$ to $\|\tilde{w}_{n-1}\|^2$. To do so, we can write

$\|w_n^o - w_{n-1}\|^2 = \left\| w_{n-1}^o + w_{n-1}^o\, j\tilde{\epsilon}(n-1) + \frac{y_n^*}{\|y_n\|^2}\, y_n w_{n-1}^o\, \tilde{\eta}(n-1) - w_{n-1} \right\|^2$. (45)

Recall that the first three terms on the right-hand side within the squared norm constitute $w_n^o$ by means of linear approximation, as in the derivation of (26). Based on (45), we get

$\|w_n^o - w_{n-1}\|^2 = \|\tilde{w}_{n-1}\|^2 + \left\| w_{n-1}^o\, j\tilde{\epsilon}(n-1) + \frac{y_n^*}{\|y_n\|^2}\, y_n w_{n-1}^o\, \tilde{\eta}(n-1) \right\|^2$, (46)

the last two terms on the right-hand side of which can be related to $|e_{\epsilon,a}(n)|^2$ and $|e_{\eta,a}(n)|^2$ by writing them in terms of the a priori errors.

Taneli Riihonen (Senior Member, IEEE) received the D.Sc. degree in electrical engineering from Aalto University, Espoo, Finland, in 2014. He is currently a tenure-track Associate Professor with the Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland. His research interests include physical-layer OFDM(A), multiantenna, multihop, and full-duplex wireless techniques, with current interest in the evolution of beyond-5G systems.

Vincent Le Nir received the Ph.D. degree in electronics from the National Institute of Applied Sciences, France, in 2004. He is currently a Senior Researcher with the Royal Military Academy, Brussels, Belgium. His research interests include digital communications and signal processing in the wireless and wireline domains, MIMO communications, space-time coding, OFDM and multicarrier code-division multiple access, turbo equalization, software-defined radio, and cognitive radio.

Marc Adrat received the Diploma and Dr.-Ing. degrees in electrical engineering from RWTH Aachen University, Aachen, Germany, in 1997 and 2003, respectively. He is currently the Head of the Software Defined Radio (SDR) Research Group, Fraunhofer FKIE, Wachtberg, Germany. His research interests include digital signal processing for mobile tactical radio communications and emerging technologies such as in-band full-duplex communications. For more than 10 years, he has been a Guest Lecturer with RWTH Aachen University for a course on channel coding.