An Approximate Maximum Likelihood Method for the Joint Estimation of Range and Doppler of Multiple Targets in OFDM-Based Radar Systems

In this manuscript, an innovative method for the detection and the estimation of multiple targets in a radar system employing orthogonal frequency division multiplexing is illustrated. The core of this method is represented by a novel algorithm for detecting multiple superimposed two-dimensional complex tones in the presence of noise and estimating their parameters. This algorithm is based on a maximum likelihood approach and combines a single tone estimator with a serial cancellation procedure. Our numerical results lead to the conclusion that the developed method can achieve a substantially better accuracy-complexity trade-off than various related techniques in the presence of closely spaced targets.

Abstract-In this manuscript, an innovative method for the detection and the estimation of multiple targets in a radar system employing orthogonal frequency division multiplexing is illustrated. The core of this method is represented by a novel algorithm for detecting multiple superimposed twodimensional complex tones in the presence of noise and estimating their parameters. This algorithm is based on a maximum likelihood approach and combines a single tone estimator with a serial cancellation procedure. Our numerical results lead to the conclusion that the developed method can achieve a substantially better accuracy-complexity trade-off than various related techniques in the presence of closely spaced targets.
Index Terms-Dual-function radar-communication, frequency estimation, harmonic retrieval, maximum likelihood estimation, orthogonal frequency division multiplexing, radar processing.

I. INTRODUCTION
W IRELESS communication and radar sensing have been advancing independently for many years, even though they share various similarities in terms of both signal processing and system architecture. In the last few years, substantial research efforts have been devoted to the design of wireless systems able to perform communication and radar functions jointly. The interest in this class of systems, that accomplish joint communication and sensing (JCAS), has been motivated by the advantages they offer in terms of device size, power consumption, cost and efficiency radio spectrum usage with respect to traditional wireless systems in various applications [1].
In this manuscript we focus on a communication-centric JCAS approach, where the radar sensing function can be considered as an add-on to the given communication system. More specifically, we take into consideration a single-input single-output (SISO) JCAS system employing orthogonal frequency division multiplexing (OFDM); this modulation format has been adopted in various wireless communication standards, thanks to its robustness to multipath fading and to its relatively simple synchronization [2].
Correlation-based and DFT-based methods for joint rangevelocity estimation exploit prior knowledge of the received signal and, even if conceptually simple and computationally efficient, may generate poor radar images in the presence of closely spaced targets or strong clutter around them [14]. Such methods can be outperformed by subspace methods, like the well known multiple signal classification (MUSIC) algorithm and the estimation of signal parameters via rotational invariant technique (ESPRIT) at the price, however, of a significantly larger computational complexity [8]. An accuracy comparable to that of subspace methods can be achieved through various ML-based algorithms, which also require a significant computational effort. Relevant contributions to this field concern: 1) the use of the amplitude weighted linearly constrained minimum variance (AW-LCMV) method for estimating the parameters of multiple targets [10]; 2) the adoption of an alternating maximization approach to mitigate the computational complexity of ML estimation [11]; 3) the development of an iterative non-linear kernel least mean square (KLMS) based technique for the estimation of target range [12]; 4) the derivation of a ML method, based on a kinematic model of detected targets, for estimating target speed [13].
The work illustrated in this manuscript has been motivated by our interest in extending our ML-based estimator of multiple overlapped complex exponentials developed in [15] to a two-dimensional (2D) scenario, and to investigate the application of the resulting algorithm to the detection of multiple targets and the estimation of their range and Doppler in an OFDM-based JCAS system. The contribution provided by this manuscript is threefold and can be summarised as follows: 1) A novel iterative DFT-based algorithm, called complex single frequency-delay estimation (CSFDE), is developed for the ML estimation of a single 2D complex tone. This estimator is based on the periodogram method for coarse frequency estimation and on a new iterative algorithm for the estimation of frequency residuals and complex amplitude. The last algorithm requires the evaluation of multiple symplectic Fourier transforms (SFTs), but, unlike other estimation techniques, does not need a prior knowledge of the overall number of targets. Moreover, its derivation is based on: a) a new approximate expression of the ML metric; b) the exploitation of the alternating minimization technique.
2) A novel recursive algorithm, called complex single frequency-delay estimation and cancellation (CSFDEC), for the estimation of the parameters of multiple superimposed 2D tones is derived. This algorithm, that combines the CSFDE algorithm with a serial cancellation & refinement procedure, is applied to target range and Doppler estimation in the considered JCAS system.
3) The accuracy of the CSFDEC algorithm is assessed by extensive computer simulations and compared with that achieved by various related algorithms available in the technical literature.
Our numerical results lead to the conclusion that the CSFDEC algorithm outperforms all the other related estimators in terms of probability of convergence, and achieves similar or better accuracy in all the considered scenarios; in particular, it is able to reliably operate in the presence of multiple closely-spaced targets in scenarios in which DFT-based methods, subspace methods and other ML-based methods fail. In addition, the computational requirements of the CSFDEC algorithm are quite limited; this is due to the fact that it exploits a DFT-based method (namely, the CSFDE algorithm) and a mathematically simple serial cancellation & refinement procedure, that unlike ML-based and subspace methods, does not require matrix inversions and eigendecompositions. Moreover, the CSFDEC algorithm is an off-grid algorithm since, unlike most of the ML-based methods available in the technical literature, does not make use of a search grid in frequency estimation; this makes its application substantially easier than on-grid algorithms.
The remaining part of this manuscript is organized as follows. In Section II, the processing accomplished in an OFDM-based radar system is summarised and the model of the signal feeding the CSFDEC algorithm is briefly derived. Section III is devoted to the derivation of the CSFDE and CSFDEC algorithms, and to the assessment of their computational complexity. The CSFDEC algorithm is then compared, in terms of accuracy and complexity, with other estimation algorithms in Section IV. Finally, some conclusions are offered in Section V.

II. SYSTEM AND SIGNAL MODELS
This section focuses on the processing accomplished at the receive side of a SISO OFDM-based JCAS system; our main objectives are deriving the mathematical model of the received signal in the presence of multiple targets and illustrating some essential assumptions on which it relies. In the following, we take into consideration the transmission of a single frame, consisting of M consecutive OFDM symbols; such symbols can convey both pilot tones (for channel estimation and synchronization) and information data to be sent to a single or multiple receivers at different locations. However, what is relevant in our study is that the considered frame is sent over a wireless channel by a transmitter which is colocated with the considered receiver; consequently, the receiver has a full knowledge of the structure and content of the whole frame and of the transmission frequency, and exploits these information for sensing purposes only. The complex envelope of the transmitted signal conveying the mth OFDM symbol (with m = 0, 1, . . . , M − 1) of the considered frame can be expressed as (e.g., see [11, eq. (3)]) where q(t) is a windowing function, s m (n) is the channel symbol carried by the nth subcarrier of the mth OFDM symbol (with n = 0, 1, . . . , N − 1), N is the overall number of subcarriers, ∆ f = 1/T is the subcarrier spacing, T is the OFDM symbol interval, T s ≜ T + T G is the overall duration of the OFDM symbol and T G is the cyclic prefix duration (also known as guard time [4]). Following [11], a rectangular windowing function is assumed in this manuscript, so that q(t) = 1 for t ∈ [−T G , T ] and q(t) = 0 elsewhere.
Given the complex envelope (1), the radio frequency (RF) waveform radiated by the radar transmitter can be expressed as where f c denotes the frequency of the local oscillator employed in the up-conversion at the transmit side. Let assume now that the last waveform is reflected by a single scatterer (i.e., by a single point target), located at the (initial) distance R from the transmitter and moving at the radial velocity 1 v with respect to it. It is not difficult to show that, in this case, the complex envelope of the signal received by the JCAS system (i.e., by the colocated receiver) is (e.g., see [11, eq. (6)]) where τ ≜ 2R/c is the overall propagation delay, f D = 2v/λ is the Doppler shift due to target motion, λ = c/f c is the wavelength of the radiated signal and w(t) is the complex additive Gaussian noise (AGN) process affecting r(t).
The signal r(t) (3) undergoes analog-to-digital conversion followed by DFT processing. A simple mathematical model describing the sequence generated by the sampling of r(t) can be derived as follows. Substituting the right-hand side (RHS) of (1) in that of (3) and extracting the portion associated with the mth OFDM symbol from the resulting expression yields where Note that: 1) the phase of A(τ ) depends on the target delay τ only, whereas that of γ n (τ ) is proportional to both τ and the subcarrier index n; 2) the factor ξ n (f D , t ′ ) produces a time-dependent phase rotation influenced by both the target speed v and the subcarrier index n; 3) the factor ζ m,n (f D ) generates a phase rotation depending on both the OFDM symbol index m and the subcarrier index n, and accounts for the so called intersubcarrier Doppler effect (e.g., see [11,Sec. II,p. 3]). Based on (4), it is not difficult to show that, if |f D τ | ≪ 1, sampling r m (t ′ ) (4) at the instant t ′ m,l = τ + T (l/N ) yields with l = 0, 1, . . . , N − 1; here, ξ n,l (f D ) ≜ ξ n (f D , T l/N ) and w m (l) ≜ w(t ′ m,l ) is the Gaussian noise affecting r m (l). In the following, we also assume that: a) the sequence {w m (l); l = 0, 1, . . . , N − 1} can be modelled as additive white Gaussian noise (AWGN); b) the target speed is limited, so that |2v/c| ≪ 1/(M N ) and |f D |/∆ f = |f D T | ≪ 1. Consequently, the factors exp(j2π(lf D )/(N ∆ f )), ξ n,l (f D ), ζ m,n (f D ) appearing in the RHS of (5) can be neglected; this leads to the simplified signal model that represents our reference model in the derivation of the CSFDE and CSFDEC algorithms. The N signal samples acquired in the mth OFDM symbol interval are collected in the vector r m ≜ [r m (0), r m (1), . . . , r m (N −1)] T , that undergoes order N DFT processing. The nth element of the resulting DFT output vector where W m (n) is the AWGN sample affecting the nth subcarrier of mth OFDM symbol. Since the channel symbol s m (n) is known by the JCAS receiver for any n and m, the estimatê of the channel frequency response H m,n at the nth subcarrier frequency in the mth OFDM symbol interval can be computed; here, F r ≜ ∆ f τ is the normalized target delay, F D ≜ f D T s is the normalized Doppler frequency, 2 a q (F X ) ≜ exp(j2πqF X ) (with q = m or n and X = D or r) and is the noise sample affectingĤ m,n (8). It is worth pointing out that: 1) the parameter F r (F D ) satisfies the inequalities F r,min ⩽ F r ⩽ F r,max, (F D,min ⩽ F D ⩽ F D,max ), with F r,min = 0 and F r,max = 1 (F D,min = −1/2 and F D,max = 1/2); 2) in all our computer simulations, the channel symbols {s m (n)} belong to a N s -ary phase shift keying (PSK) constellation; 3) based on the last assumption, the noise samples {W m (n)} (see (9)) can be modelled as AWGN (the variance of each of them is denoted σ 2 W ); 4) without any loss of generality, the factor A(τ ) appearing in the RHS of (8) can be replaced by the complex gain A ≜ a exp(jϕ), accounting for the phase rotation due to τ , the path loss and the gain (attenuation) introduced by the target.
The model (8) has been derived for a single target, but can be easily generalised to the case of K point targets. In fact, in the last case, (7) becomes R m (n) = s m (n) so thatĤ m,n (8) can be expressed aŝ A k a m (F D k ) a n (−F r k ) +W m (n) , (11) with m = 0, 1, . . . , M − 1 and n = 0, 1, . . . , N − 1; in the last two formulas, F r k , F D k and A k denote the normalized delay, the normalized Doppler frequency and the complex gain, respectively, characterizing the kth target. In the following, we assume that these complex exponentials are ordered according to a decreasing strength, so that |A k | ≥ |A k+1 |, with k = 0, . . . , K − 1.
From (11) it can be easily inferred that: 1) the noisy samples {Ĥ m,n } of the 2D channel response acquired over a single frame can be modelled as the superposition of multiple 2D complex exponentials with AWGN; 2) target detection and estimation is tantamount to identifying the K complex exponentials forming the useful component of the sequence {Ĥ m,n } and estimating their parameters. Finally, it is important to point out that the two normalized frequencies characterizing each target need to be estimated jointly; in fact, if a 1D frequency estimator is used to estimate each of them separately, a complicated pairing problem has to be solved in order to avoid any ambiguity in target detection.

III. APPROXIMATE MAXIMUM LIKELIHOOD ESTIMATION OF TWO-DIMENSIONAL COMPLEX TONES
In this section, we first derive a novel algorithm for jointly estimating the parameters of a single 2D complex tone. Then, we show how this algorithm can be exploited to detect multiple superimposed tones and estimate their parameters through a procedure based on successive cancellations and refinements. Finally, we analyse the computational complexity of the developed algorithms, and discuss the similarities and differences of our multiple tone estimator with other related estimation techniques.

A. Joint Estimation of the Parameters of a Single Two-Dimensional Complex Tone
Let us focus on the problem of estimating the parameters of a single 2D complex tone affected by AWGN on the basis of the noisy observations {Ĥ m,n }, where (see (8) or, equivalently, (11) with K = 1) (12) with m = 0, 1, . . . , M − 1 and n = 0, 1, . . . , N − 1. It is easy to show that the ML estimates F D,ML , F r,ML and A ML of F D , F r and A, respectively, can be evaluated as whereF D ,F r andÃ are the trial values of F D , F r and A, respectively, is the mean square error 3 (MSE) computed over the whole set {Ĥ m,n }, is the square error between the noisy sampleĤ m,n (12) and its useful component evaluated under the assumption that F D =F D , F r =F r and A =Ã. Substituting the RHS of the last equation in that of (15) yields whereφ m ≜ 2πmF D andφ n ≜ 2πnF r . Then, substituting the RHS of (17) in that of (14) gives, after some manipulation, where is, up to a scale factor, the so called symplectic Fourier transform (SFT) of the sequence {Ĥ m,n }. It is important out that: 1) The metric ε(F D ,F r ,Ã) is really optimal in the ML sense, if a PSK constellation is adopted for the channel symbols {s m (n)}, so that, as already pointed out in the previous section, an AWGN model can be adopted for the noise sequence {W m (n)} (see (12)). On the contrary, if a QAM constellation is selected, the samples of that sequence are not identically distributed, having, in general, different variances (e.g., see [14]); consequently, in the last case, the ML metric can still be put in a form similar to that expressed by (14), but its terms {ε m,n (F D ,F r ,Ã)} cannot be uniformly weighted, being affected by different noise levels.
2) From (18) it is easily inferred that the optimization problem (13) does not admit a closed form solution because of the nonlinear dependence of the metric ε(F D ,F r ,Ã) (18) onF D andF r .
The approach we pursued in developing an approximate (but accurate) solution to (13) is based on: a) Expressing the dependence of the function ε(F D ,F r ,Ã) on the variablesF D andF r through the couples (F D,c ,δ D ) and (F r,c ,δ r ) such that andF r = F r,c +δ rFr .
b) Assuming that the residualsδ D andδ r (appearing in the RHS of (20) and (21), respectively) are small, so that Taylor series truncated to its first four terms (i.e., to the terms associated with k = 0, 1, 2 and 3) can be employed to accurately approximate the dependence of the function ε(F D ,F r ,Ã) on these variables. c) Exploiting an iterative method, known as alternating minimization (AM; e.g., see [16]) to minimise the approximate expression derived for ε(F D ,F r ,Ã); this allows us to transform the three-dimensional (3D) optimization (13) into a triplet of interconnected one-dimensional (1D) problems, each referring to a single parameter and, consequently, much easier to be solved than the original ML problem.
Let us show now how these principles can be put into practice. First of all, the exploitation of AM requires solving the following three sub-problems: P1) minimizing ε(F D ,F r ,Ã) 4 The decomposition of an unknown frequency into the sum of a multiple of a given fundamental frequency and a frequency residual is commonly adopted in the technical literature concerning ML frequency estimation (e.g., see [15] and references therein). 5 Note that the following definition represents a specific case of the matrix H (ZP) k 1 ,k 2 defined right after (33) (in particular, its corresponds to the choice The first sub-problem can be solved exactly thanks to the polynomial dependence of the cost function ε(F D ,F r ,Ã) (18) on the variableÃ. In fact, the function ε(F D ,F r ,Ã) is minimized with respect toÃ if 6 whereȲ (F D ,F r ) can be computed exactly through its expression (19) or, in an approximate fashion, through a computationally efficient procedure based on the fact that the matrixȲ (19), (24)-(26)). For this reason, if one of the normalized frequenciesF D andF r or both of them are not a multiple ofF D andF r , respectively, an approximate evaluation ofȲ (F D ,F r ) can be accomplished by interpolating 7 the elements of the matrixȲ s (29). Note also that the last matrix can be efficiently computed by performing an order N 0 inverse fast Fourier transform (IFFT) along the rows ofĤ (ZP) 0,0 (23), followed an order M 0 fast Fourier transform (FFT) along the columns of the resulting matrix.
Let us take into consideration now P2 and P3. Such sub-problems, unlike the previous one, do not admit closed form solutions. However, approximate solutions can be developed by: 1) representing the parameters F D and F r in the same form asF D (20) andF r (21), respectively, i.e. as F D = F D,c + δ DFD and F r = F r,c + δ rFr , respectively; 2) using the 2D periodogram method to estimate F D,c and F r,c ; 3) devising a novel algorithm for estimating the residuals δ D and δ r , i.e. for accomplishing the fine estimation of F D and F r , respectively. The fine estimation algorithm is derived as follows. Based on the representations (20) and (21) of the trial variablesF D andF r , respectively, the variablesφ m andφ n defined right after (17) are expressed asφ m = 2πm F D,c + mΩ andφ n = 2πn F r,c + n∆, respectively; here, Ω ≜ 2πδ DFD and∆ ≜ 2πδ rFr . Then, the following steps are accomplished: 1) the new expressions ofφ m andφ n are substituted in the RHS of (17); 2) the resulting expression is substituted in the RHS of (14) and the approximation (27) is adopted for exp(jmΩ) and exp(jn∆) under the assumption that bothΩ and∆ are small enough. 8 This yields, after some manipulation, the approximate expression for the function ε(F D ,F r ,Ã) (14); here, It is important to point out that: a) if both ρ D and ρ r are integers, the quantityȲ k1,k2 (ρ D , ρ r ) (32) represents the if the previous condition is not met, the quantityȲ k1,k2 (ρ D , ρ r ) can be evaluated exactly on the basis of (32) or, in an approximate fashion, by interpolating multiple adjacent elements of the matrix L D L rȲk1,k2 (see (33)).
Minimizing ε CSFDE (Ω,∆,Â) (30) is equivalent to maximizing the function ξ(Ω,∆,Â) (31). The last function can be easily maximized with respect to the variableΩ (∆) if ∆ (Ω) is known. Therefore, given∆ =∆, the estimatê δ D ≜Ω/(2πF D ) of δ D can be evaluated by taking the derivative of ξ(Ω,∆,Â) with respect toΩ and setting it to zero. In fact, this leads to the estimate 10 that represents one of the two solutions of the quadratic equation with X = Ω; here, A simpler estimate (denotedΩ ′ ) of Ω is obtained neglecting the contribution of the quadratic term in the left-hand side 9 Note thatĤ has the same structure asĤ , the only difference being represented by the fact that, in its definition,Ĥ 0,0 is replaced byĤ k 1 ,k 2 . 10 In the following equations, the dependence of the function is not explicitly specified to ease reading.
(LHS) of (35), i.e. setting a Ω = 0. This leads to a first-degree equation, whose solution is (with X = Ω) Dually, givenΩ =Ω, an estimateδ r ≜∆/(2πF r ) ofδ r is computed by taking the derivative of ξ(Ω,∆,Â) with respect to∆ and setting it to zero. This leads to a quadratic equation in the variable∆ whose structure is still expressed by (35) (with X = ∆); however, its coefficients are For this reason, the estimates∆ and∆ ′ of ∆ can be computed on the basis of (34) and (39), respectively. Given the estimateδ D (δ r ) of δ D ( δ r ), a fine estimateF D of F D (F r of F r ) can be evaluated on the basis of (20) ( (21)).
2) Estimation of the complex amplitude -The new estimatê A (i) ofÂ is evaluated by means of (28); in doing so, The index i is incremented by one before starting the next iteration. At the end of the last (i.e. of the N it th) iteration, the fine estimatesF D =F andÂ =Â (Nit) of F D , F r and A, respectively, become available. The CSFDE algorithm is summarized in Algorithm 1.
It is worth pointing out that: 1) the initial coarse estimateŝ F

B. Estimation of Multiple Two-Dimensional Tones
Let us show now how the CSFDE algorithm can be exploited to recursively estimate the multiple tones forming the useful component of the complex sequence {Ĥ m,n }, whose (m, n)th element is expressed by (11), where K is assumed to be greater than unity and unknown. The method we develop to achieve this objective is called complex single frequencydelay estimation and cancellation (CSFDEC) and is based on the idea of 1) separating the contribution of the first (and strongest tone) in the RHS of (11) from that of the remaining (K − 1) tones and 2) considering the latter contribution as part of the overall noise affecting the former one. Based on this representation of {Ĥ m,n }, an estimate of the parameters (A 0 , F D0 , F r0 ) can be evaluated through the CSFDE and can be employed to subtract the contribution of the first tone to {Ĥ m,n }, so generating a residual measurement. This estimation & cancellation procedure is repeated to recursively estimate the other tones on the basis of the computed residuals until the energy of the last residual falls below a given threshold; this generates, as a by-product, an estimate of K. Moreover, in the CSFDEC method, after detecting a new tone and estimating its parameters, a re-estimation technique is executed to improve the accuracy of both this tone and the previously estimated tones.
The CSFDEC algorithm is initialized by: 1) running the CSFDE algorithm to compute the initial estimatesF  of the new (i.e., of the ith) tone (if any); b) refining the estimates of the i tones available at the beginning of the considered recursion. The procedure employed for accomplishing all this consists of the three steps described below (the p th step is denoted CSFDE-Sp, with p = 1, 2 and 3) CSFDEC-S1 (spectral cancellation and estimation of a new tone) -In this step, the following quantities are evaluated (see the initialization part of the CSFDE algorithm): a) The residual spectrum where represents the contribution given by all the ith estimated 2D tones toȲ 0,0 andC 0,0 (Â ) is the contribution provided by the kth tone (with k = 0, 1, . . . , i−1) to the same matrix (the expression of the elements of the matrixC 0,0 (·, ·, ·) is derived in Appendix A; see (60)). If the overall energy ε 0,0 where T CSFDEC is a proper threshold, the algorithm stops and the estimatê K = i of K is generated.
d) The spectral coefficients with k 1 , k 2 = 0, 1, 2, 3; here, ρ (46) and (47), respectively), whereas represents the contribution given toȲ r ) by all the estimated tones (in particular, the termȲ k1,k2 (·, ·; ·, ·, ·) appearing in the RHS of (55) represents the leakage due to the k th estimated tone for (F D , F r ) = (F  i /(2π) (see (20) and (21), respectively). The evaluation ofF It consists in repeating the previous step for each of the detected tones, starting from the first tone and ending with the last (i.e., with the (i+1)th) one. This means that, when re-estimating the kth tone, the leakage due to all the other (i − 1) tones is removed (with k = 0, 1, . . . , i). This allows to progressively refine the amplitude, normalized Doppler frequency and normalized delay of each tone, so generating the final estimates. Note that, in principle, this re-estimation procedure can be repeated multiple (say, N REF ) times.

C. Computational Complexity of the Proposed Algorithms
The computational complexity, in terms of number of floating point operations (flops), can be assessed for both the CSFDE and the CSFDEC algorithms as follows. 12 First of all, the overall computational cost of the CSFDE is expressed as where C 0 (CSFDE) (C i (CSFDE)) represents the computational cost of its initialization (each of its iterations). The cost C 0 (CSFDE) is evaluated by summing 13 : 1) the contribution due to the computation of the couple (l,p) on the basis of (43); 2) the contribution due to the computation of the matrices {Ȳ k1,k2 } on the basis of (33), including the evaluation of the spectrumȲ 0,0 ; 3) the contributions due to the evaluation of the estimatesΩ and∆, respectively, on the basis of the quadratic equation (34). The cost C i,CSFDE , instead, is evaluated by summing: 1) the contribution due to the computation of Y (F D ,F r ) on the basis of (19) or of the interpolation of a few adjacent elements of the matrixȲ s (29 ); 2) the contributions due to the evaluation ofρ D (ρ r ) on the basis of (46) ((47)); 3) the contribution due to the evaluation ofÂ on the basis of (28); 4) the contribution due to the computation of the quantitȳ Y k1,k2 (ρ ) (54) through the interpolation of a few adjacent elements of the matrix L D L rȲk1,k2 (see (33)) for the considered values of (k 1 , k 2 ); 5) the contributions due to the computation ofΩ and∆ on the basis of (34). Based on these considerations and the mathematical results illustrated in [20, App. C], it can be proved that and I D (I r ) is the interpolation order adopted in the Doppler (range) domain for the evaluation ofȲ k1,k2 (ρ (54). Note that, for small values of I D and I r (e.g., if a 2D linear or barycentric interpolation is used; see [19]), the contribution of the second term of the RHS of the last equation can be neglected, so that the order of the whole computational cost is well approximated by its first term, i.e. by the term originating from DSFT processing.
Our assessment of the complexity of the CSFDEC algorithm is based on the considerations illustrated in [15] for its 1D counterpart. Based on these, it can be proved that so that the required computational effort depends linearly on K. The last result holds if tone re-estimation is not accomplished and all the tones are detected (i.e.,K = K). The first term appearing in the RHS of the last equation accounts for the initialization (and, in particular, for the computation of the matricesȲ 0,0 (22) and {Ȳ k1,k2 ; (k 1 , k 2 ) ̸ = (0, 0)} (33)), whereas the second one for the fact that, in the CSFDEC algorithm, the CSFDE is executed K times. Note that the computational cost related to the estimation of the 2D-tones detected after the first one and to their frequency domain cancellation does not play an important role in this case. However, if tone re-estimation is executed in the CSFDEC algorithm, the parameter K appearing in the RHS of (58) is 13 Note that the evaluation of the estimate of the tone complex amplitude is neglected, being based on (28), that requires a negligible computational effort.
replaced by K 2 , since this task involves all the estimated 2Dtones.

D. Comparison of Our Multiple Tone Estimator With Related Techniques
The CSFDEC algorithm is conceptually related with: 1) the 2D periodogram method [6] (denoted 2D-FFT in the following); 2) the CLEAN algorithm [21], [22]; 3) the modified Wax and Leshem (MWL) algorithm developed in [21] and [22]. The 2D-FFT, CSFDEC and CLEAN algorithms are FFT-based techniques; however, the last two algorithms are more complicated than the first one. In fact, unlike the 2D-FFT, both the CSFDEC and CLEAN algorithms perform leakage compensation, iterative cancellation of the detected targets and tone re-estimation. Note also that the CLEAN algorithm, unlike the CSFDEC algorithm, does not accomplish fine frequency estimation and employs coarse frequency estimates in its target cancellation procedure. The MWL algorithm, similarly as the CSFDEC algorithm, relies on the idea of turning a complicated 3D optimization problem (see (13)) into a triplet of three simpler 1D optimization problems. However, unlike the CSFDEC algorithm, it requires the computation of orthogonal projections (and, consequently, of matrix inversions) and the definition of a search grid.
If frequency re-estimation is ignored, the following considerations can be formulated for the computational complexity of the above mentioned algorithms: 1) The computational effort of the CLEAN algorithm is expressed by the sum of three distinct contributions, related to its initialization (which is based on the 2D-FFT), its tone cancellation and its leakage compensation; these three costs are shared with the CSFDEC algorithm, that requires the computation of other 12 (or 15) additional DSFTs (see the previous subsection).
2) The CLEAN and MWL algorithms perform cancellation and leakage compensation in the time domain, whereas the CSFDEC algorithm performs these tasks in the frequency domain. This explains why the computational complexity of the CSFDEC cancellation, being in the order of M 0 × N 0 , is L D × L r times larger than the cost of the same task for the CLEAN and MWL algorithms.
3) The computational cost of leakage removal can be neglected for the CSFDEC algorithm because of its simplicity (complex scalar subtraction), even if it has to be accomplished on multiple DSFTs; on the other hand, the CLEAN and MWL algorithms execute this task in a similar fashion as cancellation, thus requiring O(M N ) operations.
4) The computational effort of the MWL algorithm is expressed by the sum of two distinct contributions, one due to its initialization, the other one to its iterations. The cost of the initialization task is the same as that of the Wax and Leshem (WL) algorithm illustrated in [21]. The cost of each iteration, instead, is given by that of the WL algorithm plus a contribution due to leakage compensation; the last cost is O (K M N ), being equal to that required by the CLEAN algorithm for the same procedure.
To sum up, the 2D-FFT is the least demanding algorithm; moreover, its computational effort is independent of the overall number of detected targets (i.e., of K). The MWL algorithm is less computationally demanding than the CLEAN algorithm since it exploits alternating maximization. The CSFDEC algorithm has the highest initialization cost and, usually, is computationally heavier than all the other algorithms mentioned above. However, the dependence of its complexity on K is limited and weaker than that exhibited by the CLEAN algorithm; in addition, the CSFDEC algorithm is substantially more accurate than all the other algorithms in the presence of multiple closely spaced targets, as shown in the following section.

IV. NUMERICAL RESULTS
The accuracy of the CSFDEC algorithm has been assessed in five different scenarios and compared with that achieved by the related algorithms introduced in Section III-D and four other algorithms, namely: 1) the 2D MUSIC algorithm [8], [9], [23]; 2) the approximate ML method recently proposed in [11] and dubbed modified 14 alternating projection ML (MAP-ML) algorithm; 3) an estimation algorithm based on the same 2D cost function as the MAP-ML algorithm, but not using the alternating projection method for its maximization (this algorithm is denoted modified Zhang ML, MZ-ML); 4) the expectation maximization (EM) algorithm. A detailed description of all these algorithms and an analysis of their computational complexity are provided in [20, Sec. IV], and are omitted here for space limitations; here, we limit to point out that the MAP-ML and MZ-ML algorithms require a significant computational effort, since they are ML-based and do not turn, unlike the CSFDEC and MWL algorithms, a multidimensional optimization problem into significantly simpler sub-problems.
In our work, the first three scenarios (denoted S1, S2 and S3) are characterized by a couple of targets having amplitudes A 0 = A 1 = 1, but differ for the assumptions we make about their ranges and speeds. In fact, we have that: 1) in S1, the target ranges are R 0 = 10 m and R 1 = 10 + 3R bin m, whereas the target velocities are v 0 = 1.39 m/s and v 1 = 1.39 + 3v bin m/s (here, R bin = c/(2N ∆ f ) and v bin = c/(2M f c T s ) represent the size of the range bin and velocity bin, respectively, that characterize our FFT processing in the absence of oversampling); 2) in S2, the range R 0 (velocity v 0 ) is uniformly distributed 15 , with d = 0, 1, . . . , 5. In the last scenario, ∆ R (d) = 0.8 + 0.05 d (∆ v (d) = 0.8+0.05 d) represents the tone spacing normalized with respect to R bin (v bin ) and the signal-to-noise ratio (SNR), 14 In our work, the approximate ML-based algorithm devised in [11] has been properly modified to adapt it to our signal model (11) (that does not account for inter-pulse and inter-subcarrier Doppler effects). 15 In both S2 and S3, R 0 and v 0 are independent random variables.
which, in general, is defined as SNR ≜ K−1 k=0 |A k | 2 /σ 2 W , is equal to 0 dB. The fourth scenario (denoted S4), instead, is characterized by K ∈ {2, 3, . . . , 9}, i.e. by a varying number of targets. In addition, for any K, the amplitude, range and velocity of the kth target are given by A k ≜ 10 −k∆a/10 , R k ≜ R 0 + 1.8 k R bin and v k ≜ v 0 + 1.8 k v bin , respectively (with k = 0, 1, . . . , K − 1), the random variables R 0 and v 0 are generated in the same way as S3, and the SNR is equal to 5 dB for the strongest tone. In the last scenario (denoted S5), the range and velocity of the kth target are generated according to the simple mathematical laws given for S4, but R bin = v bin = 1.1 and A k = 1 for any k (with K ∈ {3, 5, 7, 9}) are assumed; moreover, the SNR ranges from −15 dB to 25 dB.
It is important to point out that: 1) in S1 the spacing of the two targets in the velocity and range domains is fixed and not small, whereas in S2 (S3) the spacing in both the range and velocity domains is small and fixed (variable); 2) S4 is characterized by a variable number of close targets; 3) S5 is characterized by a variable number of close targets and by a variable SNR; 4) in all the considered scenarios, positive velocities have been selected for all the targets and the overall number of targets has been assumed to be known.
In our computer simulations, the estimation accuracy of each algorithm has been assessed by evaluating the root mean square error (RMSE) for the range (RMSE R ) and velocity (RMSE v ) of the considered targets. Moreover, the following parameters have been selected for the OFDM modulation: 1) overall number of subcarriers N = 32; 2) overall number of OFDM symbols/frame M = 32; 3) subcarrier spacing ∆ f = 250 kHz; 4) cyclic prefix duration T G = 12.5 µs (consequently, the OFDM symbol duration is T s = 1/∆ f + T G = 16.5 µs); 5) carrier frequency f c = 78 GHz; 6) cardinality of the PSK constellation N s = 32. Then, we have that R bin = 18.75 m and v bin = 3.64 m/s.
In S1 the accuracy of the all the considered estimation algorithms has been assessed. Moreover, the following choices have been made for these algorithms 16  Some numerical results referring to S1 are given in Fig. 1, where the RMSE R and RMSE v characterizing all the considered algorithms is shown for SNR ∈ [−15, 25] dB (in these figures and in all the following ones, simulation results are represented by labels, whereas continuous lines are drawn to ease reading). From these results it is easily inferred that: 1) The CSFDEC, CLEAN and MWL algorithms achieve good accuracy (very close to the CRLB) thanks to their use of cancellation and refinement procedures.
2) The RMSE curves for the 2D-FFT and 2D-MUSIC algorithms exhibit a floor at high SNRs.
3) The MAP-ML and the MZML algorithms perform similarly since both aim at maximizing the same cost function.
4) The EM algorithm can be fruitfully exploited to refine the estimates generated by other methods and, in particular, if employed jointly with the 2D-FFT algorithm, achieves an estimation accuracy similar to that provided by the MAP-ML and MZML algorithms.
As far as point 2) is concerned, it is worth pointing out that: a) The accuracy of the 2D-FFT algorithm is intrinsically limited by the adopted FFT order, whereas that of the 2D-MUSIC algorithm by the discretization of its steering vector; for this reason, when the spectral leakage is limited (i.e., when the targets are well spaced), the RMSE achieved by these two algorithms at large SNRs is well approximated by the square root of the variance of a random variable uniformly distributed over an interval whose width is equal to the step size of the grid of the considered algorithm, i.e., to (X 2 res /12), with X = R or v; here, R res = R bin /N 0 = 1.171875 m, v res = v bin /M 0 = 0.2276 m/s for the 2D-FFT, whereas R res = ∆ R and v res = ∆ v for the 2D-MUSIC). b) For given values of M and N , the accuracy of the 2D-FFT algorithm improves if the associated oversampling factors increase; unluckily, oversampling can provide a limited improvement by itself, since it does not add extra information, but simply allows to interpolate adjacent spectral samples.
c) The accuracy of the 2D-MUSIC algorithm can be improved by selecting a finer grid, at the price, however, of an higher computational complexity, as shown in [20, Sec. IV-B]. 18 Note that, in our simulations, positive trial values are always considered for target velocities, without loss of generality. d) Both the 2D-FFT and 2D-MUSIC algorithms do not execute refinement and/or re-estimation steps.
These considerations apply to all the following results shown for the two above mentioned algorithms.v In addition, in analyzing the results shown in Fig. 1, readers should keep in mind that: 1) the computational complexity of the CSFDEC (CLEAN) algorithm is approximately 17 (39) times higher than that of the 2D-FFT, 19 whereas that of the MWL algorithm is very close to it; 2) the complexity of the 2D-MUSIC, MAP-ML and MZML algorithms is 3481, 577 and 2593 times higher than that of the 2D-FFT algorithm, respectively; 3) the computational cost of the EM algorithm is approximately 149 (671) times smaller than that of the MAP-ML (MZML) algorithm.
Some numerical results referring to S2 are provided in Fig. 2 1) The CSFDEC, MWL, CLEAN, MAP-ML, MZML and EM algorithms are substantially more accurate than the 2D-FFT and 2D-MUSIC techniques. In particular, the RMSEs in range (velocity) of the 2D-FFT and 2D-MUSIC algorithms are 3.9 (4) and 1.5 (1.68) times higher, respectively, than that of the above mentioned group of algorithms at SNR = 0 dB; moreover, these performance gaps, in terms of both RMSE R and RMSE v , tend to increase by a factor 1.75 if the SNR is incremented by 5 dB.
2) The trend of both the RMSE R and RMSE v curves referring to the 2D-MUSIC and 2D-FFT algorithms does not follow that of the corresponding CRLB; for this reason, these algorithms are ignored in following.
3) The SNR threshold of the CSFDEC, CLEAN, MAP-ML and EM algorithms is about −10 dB, whereas that of the MWL algorithm is substantially higher (about −5 dB).
4) The MAP-ML algorithm performs similarly as the MZML algorithm; however, since the latter estimator requires an higher computational effort than the former one, it is ignored in the following.
It is also important to point out that the considerations illustrated about the computational complexity of the various algorithms in S1 still hold; however, the complexities of the CLEAN, 2D-MUSIC and MWL algorithms are 64, 5497 and 1.01 times higher than that of the 2D-FFT algorithm, respectively.
In S3, the RMSEs have been evaluated for different values of the normalized tone spacing ∆ R and ∆ v ; some numerical results referring to this scenario are illustrated in Fig. 3, that shows the dependence of RMSE R and RMSE v , respectively,   These results lead to the following conclusions: 1) The lowest threshold in range estimation is achieved by the CSFDEC algorithm (more specifically, in the considered scenario, its threshold is found at the normalized spacing ∆ R (2) = 0.9); 2) the lowest threshold in velocity estimation is achieved by both the CLEAN and CSFDEC algorithms. Note also that the complexity of the CLEAN is approximately 1.6 times higher than that of the CSFDEC algorithm in this case.
Based on the considerations illustrated above, in S4 we restrict our attention to the CSFDEC, CLEAN, MWL, MAP-ML and EM algorithms. Moreover, our performance analysis  does not concern estimation accuracy, but the probability of failure (P f ), i.e. the probability that convergence is not achieved, so that large estimation errors can be generated. In our computer simulations, we have observed that large estimation errors occur more frequently as K increases. To detect the frequency of occurrence of these errors, we have counted, in each simulation run, the number of failure events for each of the considered algorithms; in practice, an event of this type is detected whenever the absolute value of the range error and that of the velocity error (or only one of these errors) exceed the thresholds ∆ϵ r = c/(4N ∆ f ) = 9.375 m and ∆ϵ v = c/(4M f c T s ) = 1.82 m/s, respectively. 20 Moreover, in generating our results for S4, the following changes have been made with respect to the previous scenarios Note that the spacing between adjacent trial values is the same as S1-S3 in both domains. 20 Note that ∆ϵr (∆ϵv) correspond to half the size of the range (Doppler) bin characterizing the processing of the considered algorithms.
The probability of failure estimated for K = 2, 3, . . . , 9 is illustrated in Fig. 4-a). From this figure it is easily inferred that: 1) the MAP-ML and EM (MWL) algorithms exhibit a P f greater than 10 −2 for K ≥ 4 (K ≥ 5); 2) the CSFDEC algorithm is substantially more robust than all the other algorithms since it is characterized by a P f not exceeding 10 −4 for K ≤ 8; 3) the CLEAN algorithm achieves a P f smaller than 10 −2 for K ≤ 7. These results evidence that the CSFDEC algorithm performs substantially better than the other estimators in the presence of multiple closely spaced targets. This feature plays a fundamental role in the estimation of extended targets, whose radar image is usually a dense point cloud.
In S4 the computational effort required by the CSFDEC, CLEAN, MWL, MAP-ML and EM algorithms in terms of both computation time (CT) 21 and estimated number of mega FLOPs (MFLOPs) has been also evaluated. Our results, illustrated in Fig. 4-b), evidence that: 1) the MWL (MAP-ML) algorithm requires the lowest (highest) complexity in terms of both CT and MFLOPs; 2) the complexity of the CLEAN algorithm is not far from that of the MAP-ML algorithm; 3) the complexities of the CSFDEC and EM algorithms are comparable and placed in the middle.
Our last results are shown in Fig. 5 and concern the RMSE R and RMSE v of the CSFDEC algorithm in S5; N it = 10 and N REF = 3 have been selected for this algorithm (the values of its remaining parameters are the same as S1). These results show that: 1) The SNR threshold of the CSFDEC algorithm depends on K; for instance, this threshold for the range estimation is found at a SNR ≈ −9 (SNR ≈ −5) dB for K = 3 and 5 (K = 7 and 9).
2) The range and velocity estimates generated by the CSFDEC algorithm are unbiased. Note, for instance, that for both K = 7 and 9, the RMSE R and RMSE v and the corresponding CRLB curves are separated by a constant SNR gap when the SNR exceeds the above mentioned threshold; further computer simulations have evidenced that this gap can be reduced by increasing the values of N REF and, more evidently, of N it , at the price, however, of an higher computational effort.
Finally, based on all the results illustrated above, we can state that, thanks to its accuracy, its limited complexity increase with respect to the 2D-FFT method and its ability to resolve multiple closely spaced point targets, the CSFDEC algorithm represents a good candidate for target detection and estimation in future OFDM-based radars.

V. CONCLUSION
In this manuscript, a novel algorithm for the detection of a single 2D complex tone and the estimation of its parameters has been derived. Moreover, it has been shown how combining this algorithm, dubbed CSFDE, with a serial cancellation procedure leads to the development of a new algorithm for the detection and the estimation of multiple 2D tones. Then, the last algorithm, called CSFDEC, has been applied to the detection of multiple targets, and to the estimation of their range and velocity in an OFDM-based SISO radar system. In addition, it has been compared, in terms of accuracy and computational complexity, with various estimation methods available in the technical literature. Our simulation results evidence that the CSFDEC algorithm is very accurate and outperforms all the other related estimators in the presence of multiple closely spaced targets. Future work concerns the application of the CSFDEC algorithm to other JCAS systems.

A. Spectral Cancellation of a Two-Dimensional Complex Tone
In this paragraph, the derivation of expression of the vector C 0,0 (·, ·, ·) appearing in the RHS of (51) is sketched. This vector is evaluated to cancel the contribution of the sequence s m,n F D ,F r ,Ā =Āw m Dw n r (59) to the vectorȲ 0,0 (22) (the adopted cancellation procedure is expressed by (50) and (51)); here,w D ≜ exp(j2πF D ) andw r ≜ exp(−j2πF r ). SinceȲ 0,0 is the order (M 0 , N 0 ) DSFT of the zero-padded versionĤ (ZP ) 0,0