On Performance of Sparse Fast Fourier Transform Algorithms Using the Flat Window Filter

The problem of computing the Sparse Fast Fourier Transform (sFFT) of a <inline-formula> <tex-math notation="LaTeX">$K$ </tex-math></inline-formula>-sparse signal of size <inline-formula> <tex-math notation="LaTeX">$N$ </tex-math></inline-formula> has long received significant attention. The first stage of sFFT, named frequency bucketization, hashes the frequency coefficients into <inline-formula> <tex-math notation="LaTeX">$B(\approx {K})$ </tex-math></inline-formula> buckets. Frequency bucketization is achieved through the use of filters: the Dirichlet kernel filter, the aliasing filter, the flat filter, etc. Bucketization through these filters decreases runtime and sampling complexity by working in low dimensions. sFFT algorithms using the flat filter have been an active research topic since the filter's emergence because of its convenience, efficiency, and wide application. The next stage of sFFT is spectrum reconstruction, which identifies the frequencies that are isolated in their buckets. To date, more than thirty different sFFT algorithms follow this idea, each with its own method. An important question is how to analyze and evaluate the performance of these sFFT algorithms in theory and practice. This paper mainly discusses sFFT algorithms using the flat filter. In the first part, the paper introduces the techniques in detail, including two types of frameworks, five different methods to reconstruct the spectrum, and the corresponding algorithms. We draw conclusions about the theoretical performance of these five algorithms, including runtime complexity, sampling complexity, and robustness.
In the second part, we conduct three categories of experiments, computing signals of different SNR, different <inline-formula> <tex-math notation="LaTeX">$N$ </tex-math></inline-formula>, and different <inline-formula> <tex-math notation="LaTeX">$K$ </tex-math></inline-formula> on a standard testing platform, and record the run time, the percentage of the signal sampled, and the <inline-formula> <tex-math notation="LaTeX">$L_{0},L_{1},L_{2}$ </tex-math></inline-formula> errors in both the exactly sparse case and the general sparse case. The experimental results are consistent with the theoretical inferences. They can help us optimize these algorithms and use them correctly in the appropriate areas.


I. INTRODUCTION
The Discrete Fourier Transform (DFT) is one of the most important and widely used techniques in signal processing and mathematical computing. The most popular algorithm to compute the DFT is the fast Fourier Transform (FFT) invented by Cooley and Tukey. The algorithm computes the DFT of a signal of size $N$ in $O(N \log N)$ time using $O(N)$ samples. The FFT dramatically simplifies the computation; however, with the emergence of big data problems, the FFT is no longer fast enough. Furthermore, it is sometimes hard to acquire a sufficient amount of data to compute the DFT. These two problems have become the major computational bottleneck in many applications. They motivate the need for new algorithms that compute the Fourier Transform in sub-linear time and that use only a subset of the input data. Many ideas were proposed to realize such an algorithm. Later work focused on the characteristics of the signal itself and found that a large number of signals are sparse in the frequency domain: only $K$ frequencies are non-zero or significantly large. This feature is universal and inherent in signals from many fields (e.g., audio, video data, medical images). In this case, when $K \ll N$, one can retrieve the information with high accuracy using only the coefficients of the $K$ most significant frequencies. The sFFT was proposed on this basis and has achieved excellent results. Research on the sFFT has been a hot topic in signal processing since its birth; it was named one of the 10 Breakthrough Technologies by MIT Technology Review in 2012.
The associate editor coordinating the review of this manuscript and approving it for publication was Wenming Cao.
The first stage of the sFFT algorithm is bucketization, in which the value of each bucket is the sum of the values of the frequency coefficients that hash into that bucket. The number of buckets is denoted by $B$, and the size of one bucket is denoted by $L$. The process of bucketization is achieved through the use of filters. The Dirichlet kernel filter convolves the signal with a rectangular window in the time domain, which is equivalent to multiplying the signal by a Dirichlet kernel window of size $L$ ($L \ll N$) in the frequency domain. The typical application using the Dirichlet kernel filter is the AAFFT algorithm. The aliasing filter multiplies the signal by a comb window in the time domain, which is equivalent to convolving the signal with a comb window of size $B$ ($\approx K$) in the frequency domain. The typical application using the aliasing filter is the FFAST algorithm. The flat filter multiplies the signal by a mix window in the time domain, which is equivalent to convolving the signal with a flat window of size $L$ ($L \ll N$) in the frequency domain. The typical application using the flat filter is the sFFT1.0 algorithm. After bucketization, the algorithm focuses on the non-empty buckets and computes the positions and values of the significant frequency coefficients in those buckets; we call this stage spectrum reconstruction or frequency identification. As surveyed below, more than thirty algorithms use the sFFT idea, and more than ten of them use the flat filter. A central question is how to analyze and evaluate the performance of these algorithms by comparing them with each other and with other types of algorithms. It should be verified whether the runtime complexity, sampling complexity, and robustness observed in practice are consistent with the theory. Are there better ways to improve these algorithms when they are used in practice? The results of these performance analyses guide us in optimizing these algorithms and using them correctly in different areas.
The first sFFT algorithm [1] with sub-linear runtime and the sub-sampling property is a randomized algorithm with runtime and sampling complexity $O(K^2 \mathrm{poly}(\log N))$. It was later improved to $O(K \mathrm{poly}(\log N))$ [2], [3] through the use of a binary search technique for spectrum reconstruction and the use of unequally spaced FFTs. This algorithm is the so-called Ann Arbor fast Fourier Transform (AAFFT); its versions are AAFFT0.5 and AAFFT0.9.
The so-called Fast Fourier Aliasing-based Sparse Transform (FFAST) [4], [5], which focuses on exactly $K$-sparse signals, is an efficient sFFT algorithm. Its approach is based on downsampling the input signal using a constant number of co-prime downsampling factors guided by the Chinese Remainder Theorem (CRT). The aliasing patterns of the different downsampled signals are formulated as parity-check constraints of erasure-correcting sparse-graph codes. The FFAST algorithm costs $O(K \log K)$ time to compute exactly sparse signals and uses only $O(K)$ samples. The authors then adapted the FFAST framework to signals corrupted by white Gaussian noise. They showed that the extended noise-robust algorithm R-FFAST [6], [7] computes the DFT using $O(K \log K)$ samples in $O(K \log^4 N)$ runtime. These two algorithms perform well when $N$ is a product of some smaller prime numbers.
The so-called sFFT by downsampling in the time domain (sFFT-DT) [8] takes advantage of the aliasing filter. The idea behind sFFT-DT is to downsample the original input signal first and then conduct all subsequent operations on the downsampled signals. To overcome the aliasing problem, the authors consider the locations and values of the $K$ non-zero entries as variables; the aliasing problem is then found to be equivalent to the moment-preserving problem (MPP), which can be solved via orthogonal polynomials or via syndrome decoding with a compressive sensing (CS) based solver. The deterministic algorithm called the Gopher Fast Fourier Transform (GFFT) [9], which is based on the CRT, is an aliasing-based search algorithm. The approximation error bounds in [9] are further improved in [10]. Later, the so-called Christlieb Lawlor Wang Sparse Fourier Transform (CLW-SFT), which uses the phase encoding method, was given in [11], [12]. The noiseless version of this algorithm is an adaptive algorithm [12] with runtime $O(K \log K)$. The authors extended this algorithm [11] with a multiscale error-correcting method to cope with high-level noise, at runtime $O(K^2 \log K)$. The performance of DMSFT (derived from GFFT) and CLW-DSFT (derived from CLW-SFT) was evaluated in [13], where their runtime and robustness characteristics were compared with those of other algorithms. These four algorithms all assume that the algorithm can sample the signal anywhere it wants.
The sFFT algorithms using the flat window filter, the so-called sFFT1.0-sFFT4.0 [14], [15], can compute exactly $K$-sparse signals in time $O(K \log N)$ and general $K$-sparse signals in time $O(K \log N \log(N/K))$. These algorithms leverage the characteristics of the flat filter. The sFFT1.0 and sFFT2.0 algorithms can identify and estimate the $K$ largest coefficients in one shot. In the exactly sparse case, the sFFT3.0 algorithm can estimate the position using only two samples of the filtered signal, an approach inspired by frequency offset estimation. Later, a new robust algorithm called Matrix Pencil FFT (MPFFT) [16] was proposed on the basis of the sFFT3.0 algorithm. Its major new ingredient is a mode collision detector based on the matrix pencil method. This method enables the algorithm to use fewer samples of the input signal.
The paper [17] gives an overview of sFFT technology, summarizes a three-step approach for the spectrum reconstruction stage, and provides a standard testing platform that can be used to evaluate different sFFT algorithms. Other studies approach the sFFT problem from many angles: computational complexity [18], [19], performance of the algorithm [20], [21], software [22], [23], higher dimensions [24], [25], implementation [26], hardware [27], and special settings [28], [29].
The differences among the sFFT algorithms can be seen from the brief analysis above. The Dirichlet kernel filter is not efficient because it bins only some frequency coefficients into one bucket at a time. As for the aliasing filter, it is difficult to handle the worst case: many frequency coefficients may accidentally fall into the same bucket, and if $B$ can only be a power of two, the scaling operation is of no use. In comparison, using the flat filter is very convenient and efficient.
VOLUME 8, 2020
This paper is structured as follows. Section II and Section III provide a brief overview of the sFFT technique. Section IV introduces and analyzes two frameworks and the five spectrum reconstruction methods of five algorithms. In the one-shot framework, the sFFT1.0 and sFFT2.0 algorithms use the voting method with the help of stochastic characteristics. In the iterative framework, the sFFT3.0 and sFFT4.0 algorithms use the phase encoding method with the help of time shift characteristics, and the MPSFT algorithm uses the matrix pencil method with the help of the Prony model. In Section V, we conduct three categories of comparison experiments. The first compares the five algorithms with each other. The second compares them with other sFFT algorithms. The third compares their optimized versions with their unoptimized versions. The analysis of the experiments agrees with the theoretical inferences.

II. NOTATION
In this section, we present the notation and basic definitions used for the sFFT. We use $\omega_N = e^{-2\pi i/N}$ as the $N$-th root of unity. Let $F_N \in \mathbb{C}^{N \times N}$ be the DFT matrix of size $N$, whose entries are powers of $\omega_N$ up to a normalization constant. The DFT of a vector $x \in \mathbb{C}^N$ (a signal of size $N$, where $N$ is a power of two) is the vector $\hat{x} = F_N x \in \mathbb{C}^N$. It is also necessary to consider the inverse of the DFT matrix, $F_N^{-1} \in \mathbb{C}^{N \times N}$; the inverse DFT of $\hat{x}$ is the vector $x = F_N^{-1}\hat{x}$. With indices taken modulo $N$, so that $x_{-i} = x_{N-i}$, the convolution is defined as $(x * y)_i = \sum_{j=0}^{N-1} x_j y_{i-j}$. The coordinate-wise product is $(xy)_i = x_i y_i$, and the DFT of $xy$ is given by the convolution of $\hat{x}$ and $\hat{y}$, as described in Equation 8. For exact signals, $\hat{x}$ is exactly $K$-sparse if it has exactly $K$ non-zero frequency coefficients while the remaining $N - K$ coefficients are zero. For general signals, $\hat{x}$ is general $K$-sparse if its $K$ largest frequency coefficients are significant while the remaining $N - K$ coefficients are small. The goal of the sFFT is to recover a $K$-sparse approximation of $\hat{x}$ by finding the frequency positions $f$ and estimating the values $\hat{x}_f$ of the $K$ largest coefficients.
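The notation above can be checked numerically on a small example. The sketch below is only an illustration: it assumes a forward DFT with a $1/N$ factor (the normalization is an assumption made here so that the product rule holds without an extra constant), and the helper names `dft`, `idft`, and `circ_conv` are hypothetical.

```python
import cmath

def dft(x):
    """Forward DFT with omega_N = exp(-2*pi*i/N); the 1/N factor is an assumed normalization."""
    n = len(x)
    w = cmath.exp(-2j * cmath.pi / n)
    return [sum(x[j] * w ** ((j * k) % n) for j in range(n)) / n for k in range(n)]

def idft(xh):
    """Inverse of dft() above (no scaling factor under the assumed normalization)."""
    n = len(xh)
    w = cmath.exp(-2j * cmath.pi / n)
    return [sum(xh[k] * w ** (-((j * k) % n)) for k in range(n)) for j in range(n)]

def circ_conv(a, b):
    """Circular convolution (a * b)_i = sum_j a_j b_{i-j}, indices taken mod N."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def max_abs_diff(a, b):
    return max(abs(u - v) for u, v in zip(a, b))

x = [1, 2 - 1j, 0, -1, 0.5j, 3, 0, 1]
y = [0, 1, 1j, 2, -1, 0, 0.5, 1]

# The inverse DFT undoes the forward DFT.
roundtrip_err = max_abs_diff(idft(dft(x)), x)

# The DFT of the coordinate-wise product equals the convolution of the DFTs.
product = [u * v for u, v in zip(x, y)]
conv_err = max_abs_diff(dft(product), circ_conv(dft(x), dft(y)))

print(roundtrip_err, conv_err)
```

Both printed errors should be at the level of floating-point rounding. With a different normalization of $F_N$, the product rule picks up a constant factor.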

III. TECHNIQUES
In this section, we start with an overview of the techniques that we will use in the sFFT.

A. RANDOM SPECTRUM PERMUTATION
The random permutation includes two operations: a shift operation and a scaling operation. Let $\tau \in \mathbb{R}$ be the offset parameter, and let the matrix $S_\tau \in \mathbb{R}^{N \times N}$ represent the shift operation. Let $\sigma \in \mathbb{R}$ be the scaling parameter, and let the matrix $P_\sigma \in \mathbb{R}^{N \times N}$ represent the scaling operation. Suppose $\sigma^{-1}$ exists mod $N$, that is, $\sigma^{-1}\sigma \equiv 1 \pmod{N}$. For a vector $x \in \mathbb{C}^N$, the permuted vector is $x' = S_\tau P_\sigma x$. The random permutation isolates spectral components from each other: in the spectrum $\hat{x}'$ of $x' = S_\tau P_\sigma x$, the coefficient $\hat{x}_f$ is moved to position $u = (\sigma f) \bmod N$ and multiplied by a phase factor determined by $\tau$.
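The effect of the permutation on the spectrum can be sketched on a toy signal. The index convention $x'_i = x_{\sigma(i-\tau)}$ and the $1/N$ normalization used below are assumptions chosen for illustration (other sign conventions only change the sign of the phase), and the function names are hypothetical.

```python
import cmath

N = 16
OMEGA = cmath.exp(-2j * cmath.pi / N)

def dft(x):
    # Forward DFT with an assumed 1/N normalization.
    return [sum(x[j] * OMEGA ** ((j * k) % N) for j in range(N)) / N for k in range(N)]

def synthesize(spectrum):
    # Time signal whose DFT (as defined above) is `spectrum`.
    return [sum(spectrum[m] * OMEGA ** (-((j * m) % N)) for m in range(N)) for j in range(N)]

def permute(x, sigma, tau):
    # Assumed convention: scale first, then shift, x'_i = x_{sigma * (i - tau) mod N}.
    return [x[(sigma * (i - tau)) % N] for i in range(N)]

# Two adjacent frequencies that would fall into the same bucket.
spectrum = [0j] * N
spectrum[2] = 1.0 + 0j
spectrum[3] = 0.5j
x = synthesize(spectrum)

sigma, tau = 5, 3  # sigma must be invertible mod N (odd when N is a power of two)
permuted_spectrum = dft(permute(x, sigma, tau))

# The coefficient at f moves to u = sigma * f mod N and gains the phase OMEGA^(tau * u).
support = sorted(k for k in range(N) if abs(permuted_spectrum[k]) > 1e-9)
phase_ok = all(
    abs(permuted_spectrum[(sigma * f) % N] - spectrum[f] * OMEGA ** ((tau * sigma * f) % N)) < 1e-9
    for f in (2, 3)
)
print(support, phase_ok)
```

In this example the neighbouring frequencies 2 and 3 end up five positions apart, which is what lets the later bucketization isolate them.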

B. WINDOW FUNCTION
The window function is a mathematical tool that can be seen as a matrix multiplying the original signal. We introduce three filters used in the sFFT algorithms mentioned in this paper.
The first filter is the frequency aliasing filter. Through this filter, the signal in the time domain is subsampled such that the corresponding signal in the frequency domain is aliased. Let $L \in \mathbb{Z}^+$ be the subsampling factor and $B \in \mathbb{Z}^+$ be the subsampling number. Let the matrix $D_L \in \mathbb{R}^{B \times N}$ represent the subsampling operation. Let the vectors $y_{L,\tau}, \hat{y}_{L,\tau} \in \mathbb{C}^B$ be the filtered signal obtained by the shift operation and the aliasing filter. If $y_{L,\tau} = D_L S_\tau x$, we get formula (14). If $\tau = 0$, we get formula (15).
The second filter is the frequency flat filter. We use a filter vector $G$ that is concentrated in both the time and frequency domains: $G$ is zero except at a small number of time coordinates, with $\mathrm{supp}(G) \subseteq [-w/2, w/2]$, and its Fourier Transform $\hat{G}$ is negligible except at a small fraction $L$ ($\approx \varepsilon N$) of the frequency coordinates (the pass region). The paper [14] claims that there exists a standard window function $G(\varepsilon, \varepsilon', \delta, w)$ that satisfies formula (16). The filter can be obtained by convolving a Gaussian function with a boxcar window function, and $\mathrm{supp}(G) = w = O((1/\varepsilon)\log(1/\delta))$. One can potentially use a Dolph-Chebyshev window function to obtain a minimal big-Oh constant. In this paper, the filter $G \in \mathbb{C}^N$ is an $(L/N, L/2N, \delta, w)$ flat window. The width of the filter in the time domain is denoted by $w$, the width of the passband region in the frequency domain is denoted by $L$, and the number of buckets is denoted by $B$, with $B = N/L$.
Let the matrix $Q_L \in \mathbb{C}^{N \times N}$ be the diagonal matrix whose diagonal entries are the filter coefficients in the time domain. The third filter is the frequency subsampled filter. Through this filter, the signal in the time domain is aliased such that the corresponding signal in the frequency domain is subsampled. Let the matrix $U_L \in \mathbb{R}^{B \times N}$ represent the aliasing operator. Let the vectors $y_L, \hat{y}_L \in \mathbb{C}^B$ be the filtered signal obtained by the subsampled filter, with $y_L = U_L x$.

C. FREQUENCY BUCKETIZATION
The process of bucketization in this paper is achieved through the use of the flat filter, the subsampled filter, the shift operation, and the scaling operation. It is equivalent to multiplying the signal by $F_B U_L Q_L S_\tau P_\sigma$, so the filtered spectrum is $\hat{y}_{L,\tau,\sigma} = F_B U_L Q_L S_\tau P_\sigma x$. Proof: Based on the above-mentioned properties, we get formula (20). Let $I$ be a set of coordinate positions with the position $f = (\sigma^{-1}u) \bmod N \in I$, and suppose there is no hash collision in bucket $i$, where $i = \mathrm{round}(u/L)$ and $\mathrm{round}()$ rounds to the nearest integer. Through formula (20), we can get formula (21). As seen above, frequency bucketization includes three steps: random spectrum permutation ($x' = S_\tau P_\sigma x$, which costs 0 runtime), flat window filtering ($x'' = Q_L x'$, which costs $w$ runtime and $w$ samples), and the Fourier Transform of the aliased signal ($\hat{y}_{L,\tau,\sigma} = F_B U_L x''$, which costs $B \log B$ runtime and 0 samples). In total, one round of frequency bucketization costs $w + B \log B$ runtime and $w$ samples.
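The last bucketization step relies on the fact that aliasing a signal in time and taking a $B$-point transform yields every $L$-th coefficient of the full $N$-point spectrum. The following sketch illustrates only this step (the flat window and the permutation are left out). It uses unnormalized sums, which is an assumption made so that the identity holds without a factor of $L$, and the function names are hypothetical.

```python
import cmath

def dft_sum(x):
    # Unnormalized DFT sum with omega_n = exp(-2*pi*i/n).
    n = len(x)
    w = cmath.exp(-2j * cmath.pi / n)
    return [sum(x[j] * w ** ((j * k) % n) for j in range(n)) for k in range(n)]

def alias_time(x, num_buckets):
    # Aliasing operator: fold the length-N signal into B entries, y_i = sum_j x_{i + j*B}.
    size = len(x) // num_buckets
    return [sum(x[i + j * num_buckets] for j in range(size)) for i in range(num_buckets)]

N, B = 32, 8
L = N // B

# An arbitrary deterministic test signal.
x = [complex((3 * t) % 7, (5 * t) % 4) for t in range(N)]

full_spectrum = dft_sum(x)             # N-point transform, O(N log N) with an FFT
buckets = dft_sum(alias_time(x, B))    # B-point transform of the folded signal

# Bucket k equals coefficient k*L of the full spectrum.
subsample_err = max(abs(buckets[k] - full_spectrum[k * L]) for k in range(B))
print(subsample_err)
```

In the actual algorithms the signal is first multiplied by the flat window, so only $w$ time samples are non-zero and each bucket collects a smoothed sum over a band of about $L$ frequencies rather than a single coefficient.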

IV. ALGORITHMS ANALYSIS
As mentioned above, the goal of frequency bucketization is to decrease runtime and sampling complexity by taking advantage of low dimensions; after bucketization, the filtered signal $y_{L,\tau,\sigma}$ is obtained from the original signal $x$. In this section, we introduce two frameworks, five methods, and the corresponding algorithms, each of which recovers the spectrum $\hat{x}$ from the filtered signal $y_{L,\tau,\sigma}$ in its own way.
A. THE sFFT1.0 ALGORITHM BY THE ONE-SHOT FRAMEWORK
The first framework reconstructs the spectrum directly in one shot and does not need iteration. The spectrum reconstruction of the sFFT1.0 algorithm includes two kinds of rounds: location rounds and estimation rounds. Each location round generates a list of candidate coordinates $I_r$. A candidate coordinate $i \in I_r$ has a certain probability of being the index of one of the $K$ significant coefficients in the spectrum. Running multiple rounds increases this probability, so after $R$ ($\approx \log N$) rounds the candidate coordinates can be voted on with high probability of success. The next step is to run estimation rounds, which determine the value $\hat{x}_f$ of each identified frequency isolated in its bucket; this works because the value of a bucket approximates the frequency coefficient identified in the bucket if there is no hash collision. The block diagram of the sFFT algorithms of the one-shot framework is shown in Figure 1. We explain the details as follows.
Stage1 Bucketization: Run $R$ bucketization rounds.
Stage2-Step1 Location rounds: After $R$ rounds, return $R$ sets of coordinates $I_1, \cdots, I_R$ (the set $I_r$ is the union of $2K$ sets chosen from the $B$ sets $J_{r,0}, J_{r,1}, \cdots, J_{r,B-1}$ of the $r$-th round). Then vote: count the number $s_i$ of occurrences of each found coordinate $i$, that is, $s_i = \|\{r \mid i \in I_r\}\|_0$, where $\|\cdot\|_0$ denotes the 0-norm. Keep only the coordinates that occur in more than half of the rounds: $I = \{i \in I_1 \cup \cdots \cup I_R \mid s_i > R/2\}$.
Stage2-Step2 Estimation rounds: After the location rounds, the set $I$ is available, and $R$ sets of frequency coefficients $\hat{x}_1, \cdots, \hat{x}_R$ are estimated. If the position $f \in I$, the value at position $f$ is obtained through formula (21). For an identified position $f$, $R$ different values $\hat{x}_f^r$ are obtained over the $R$ rounds; the median of these values is used as the final estimator.
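The voting and median steps of Stage 2 can be sketched as follows. This is a simplified illustration rather than the reference implementation: the candidate sets and per-round estimates are hypothetical inputs, and taking the median of the real and imaginary parts separately is an assumption about how the median of complex values is formed.

```python
from collections import Counter
from statistics import median

def vote(round_sets):
    """Keep the coordinates that appear in more than half of the R location rounds."""
    num_rounds = len(round_sets)
    counts = Counter(i for candidates in round_sets for i in set(candidates))
    return {i for i, s_i in counts.items() if s_i > num_rounds / 2}

def median_estimate(values):
    """Component-wise median of R complex estimates (an assumed convention)."""
    return complex(median(v.real for v in values), median(v.imag for v in values))

# Hypothetical candidate sets I_1..I_5 from R = 5 location rounds.
rounds = [{3, 17, 40}, {3, 17}, {3, 22, 17}, {3, 9}, {17, 3}]
located = vote(rounds)

# Hypothetical per-round estimates of one coefficient; the third one is an outlier,
# for example from a round in which the bucket suffered a hash collision.
estimates = [1 + 1j, 1.1 + 0.9j, 5 - 3j]
value = median_estimate(estimates)

print(sorted(located), value)
```

The median discards the outlying estimate, which is why a minority of rounds with collisions does not spoil the final value.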
Finally, we analyse the performance of the sFFT1.0 algorithm. In stage 1 it costs $R(w + B \log B)$ runtime. In stage 1 it needs $w$ samples per round. In the first round, a given sample is not chosen with probability $(N - w)/N$; supposing this probability does not change between rounds, the average number of samples chosen after $R$ rounds is $N\left(1 - ((N - w)/N)^R\right)$.

B. THE sFFT2.0 ALGORITHM BY THE ONE-SHOT FRAMEWORK
As is shown in Figure 1 Compared to the sFFT1.0 algorithm, the runtime the sFFT2.
Compared to the one-shot framework, the iterative framework has two improvements. The first is that once a frequency coefficient of the signal has been found and estimated, it can be subtracted from the signal. This reduces the amount of work to be done in subsequent steps. It is not necessary to update the whole input signal; it is sufficient to update the $B$-dimensional buckets. In this way, the effects of already found coefficients can be removed in $O(B)$ time. The second improvement is a better method for finding the significant frequency coordinates of the signal than voting over $R$ rounds. In the one-shot framework, $R$ ($\approx \log N$) rounds are run and their results are combined to obtain the correct locations with high probability. In the iterative algorithms, two or $\log_2 L$ rounds are enough, depending on the method.
In the $m$-th iteration, let $K_m$ be the expected sparsity, $R_m$ be the number of rounds, $B_m$ be the number of buckets, $L_m$ be the size of one bucket, $w_m$ be the support of the filter $G$, $\hat{y}_{L,\tau,\sigma}$ be the filtered spectrum, $\hat{y}_{update}$ be the spectrum already obtained, $\hat{x}_{m-1}$ be the previous result, $\hat{y}'_{L,\tau,\sigma}$ be the spectrum that still needs to be recovered, $\hat{x}'_m$ be the recovered spectrum, and $\hat{x}_m$ be the new result; the set $\tau = \{\tau_1, \tau_2, \cdots, \tau_R\}$ and the set $\sigma = \{\sigma_1, \sigma_2, \cdots, \sigma_R\}$ are the parameters. We have $R_m = 2$ in the sFFT3.0 algorithm, $R_m = \log_l L_m$ in the sFFT4.0 algorithm, and $R_m = \log_2 L_m$ in the MPSFT algorithm. The detailed course of the $m$-th iteration is shown in Figure 2 and explained as follows.
Step1: Run $R_m$ bucketization rounds for $K_m$, $B_m$, $L_m$, set $\sigma$ and set $\tau$ to calculate $\hat{y}_{L,\tau,\sigma} = F_B U_L Q_L S_\tau P_\sigma x$, representing the filtered spectrum.
Step5: $\hat{x}_m = \hat{x}_{m-1} + \hat{x}'_m$, representing the result of this iteration.
Step6: If it is the last iteration, the final result is $\hat{x}_m$; otherwise $\hat{x}_m$ is the input used to compute $\hat{y}_{update}$ in the next iteration.
In the exactly sparse case, the phase encoding method of the sFFT3.0 algorithm locates the position using only $R$ ($= 2$) rounds instead of $R$ ($\approx \log N$) rounds. In the first round we set $\tau_1 = 0$, and in the second round we set $\tau_2 = 1$. Suppose bucket $i$ contains only one large frequency; then the two bucket values $\hat{y}_{L,0,\sigma}[i]$ and $\hat{y}_{L,1,\sigma}[i]$ differ only by a phase factor that encodes the position.
Proof: In the first iteration, suppose $w_1 = B_1 \log(N/\delta)$ and $K_1 = K$; it costs $2(w_1 + B_1 \log B_1 + K_1) = O(B_1 \log N)$ runtime and finds at least $K/2$ true frequencies. In the second iteration, suppose $B_2 = B_1/2$, $w_2 = B_2 \log(N/\delta)$, and $K_2 = K/2$; it costs $2(w_2 + B_2 \log B_2)$ runtime in step 1, $2K_1$ runtime in step 2, $2B_2$ runtime in step 3, $2K_2$ runtime in step 4, and $K_2$ runtime in step 5, so the second iteration costs $O(w_2 + B_2 \log B_2 + K_1) < O(B_1 \log N)/2$ runtime in total. The cost of the iterations therefore decreases geometrically, so the total runtime is $O(B_1 \log N)$, that is, $O(K \log N)$.
In the sFFT4.0 algorithm, the multiscale phase encoding method locates the position using only $R$ ($= \log_l L$) rounds, provided that Lemma 6 is satisfied (let $l$ be the multiscale parameter).
Lemma 5: In bucket $i$, suppose the located position is denoted by $u'$, the real position is denoted by $u$, and the noise is denoted by $S_\tau[i]$, so that $S_0[i]$ is the noise in bucket $i$ for $\tau = 0$ and $S_1[i]$ is the noise in bucket $i$ for $\tau = 1$; it satisfies formula (22). The phase function maps a unit-modulus complex number $\theta$ to the angle in $[0, 2\pi)$ whose complex exponential equals $\theta$. In the sFFT3.0 algorithm, the guarantee requires formula (23), whose bound on the phase error is $\pi/N$.
Lemma 6: In bucket $i$, let $L$ be the range of this location step, $l$ be the multiscale parameter, $r$ be the size of one scale ($r = L/l$), $u_0$ be the initial position, $u'_l$ be the located value derived from the located position $u'$ ($u'_l = (u' - u_0)/r$, $u'_l \in [0, l]$), $u_l$ be the real value derived from the real position $u$ ($u_l = (u - u_0)/r$, $u_l \in [0, l]$), $\tau_1 = 0$, and $\tau_2 \approx N/L$. In the sFFT4.0 algorithm, the guarantee requires formula (24), whose bound on the phase error is $\pi/l$.
Proof: The restrictive condition of formula (23) is very harsh in the sFFT3.0 algorithm when $N$ is large, so the sFFT3.0 algorithm is not robust. From Lemma 6, the location is most robust when we use binary search ($l = 2$), because the confidence upper limit ($\pi/2$) is very large, but binary search is not efficient. To decide how to set the multiscale parameter $l$ and its confidence upper limit, we run a Monte Carlo experiment on the phase error. We conduct two categories of experiments that compute the base-10 logarithm of the phase error and its probability distribution function (PDF) over all valuable buckets and all rounds, for input signals of different $N$ under different signal-to-noise ratio (SNR) conditions, counting only buckets without aliasing; the result is Figure 3. From Figure 3, we can see that as the SNR increases (from the red space to the purple space), the probability of a small error increases. Comparing small $N$ and big $N$ under the same $K$ and the same SNR, if $N$ is big, the noise in one bucket has less energy relative to the effective signal; both theory and experiment indicate that the variance of the error becomes smaller when $N$ is big. From Figure 3, to keep the probability greater than 0.99 at SNR = $-20$ dB (red space), the threshold would have to exceed $10^{0.5} \approx 3.2$, which seems impossible.
At SNR = $-10$ dB (blue space), the threshold should exceed $10^{0} = 1$; this can be achieved by the binary search method because its confidence upper limit is $\pi/2 \approx 1.57$. At SNR = 0 dB (yellow space), the threshold should exceed $10^{-0.5} \approx 0.3$, and for SNR > 0 dB a high probability is certain if the threshold exceeds 0.3. At SNR = 120 dB (purple space), a threshold of $\pi/N = \pi/8192 \approx 0.0004 > 10^{-4}$, or of $\pi/N = \pi/1048576 \approx 0.000003 > 10^{-6}$, also keeps a high probability of satisfying formula (23). (Remark: the PDF of the phase error does not change much with different $\tau$ and different $\sigma$.)
In the actual sFFT4.0 algorithm, we use $l = 8$; the confidence upper limit is $\pi/8 \approx 0.4$, which keeps a high probability of satisfying formula (24) when SNR $\geq 0$ dB. As to the runtime complexity, the algorithm runs $\log_l L$ ($= \log_8(N/B)$) rounds instead of two rounds in every iteration, so the runtime and sampling complexity is $O(K \log N \log_8(N/K))$.
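The two-sample phase encoding idea behind sFFT3.0, and the reason its $\pi/N$ bound is so strict, can be sketched with an idealized bucket. The model below is an assumption made for illustration: it takes $\sigma = 1$, ignores the filter gain, and uses a positive phase convention, so `m0` and `m1` stand for the two bucket values obtained with $\tau = 0$ and $\tau = 1$. The function name `locate` is hypothetical.

```python
import cmath
import math

def locate(m0, m1, n):
    """Recover the frequency index from the phase of m1 / m0."""
    phase = cmath.phase(m1 / m0) % (2 * math.pi)   # phase in [0, 2*pi)
    return round(phase * n / (2 * math.pi)) % n

N = 8192
f = 1234                 # true position of the single large frequency
a = 0.7 - 0.2j           # its (unknown) complex amplitude

m0 = a                                        # bucket value for tau = 0
m1 = a * cmath.exp(2j * math.pi * f / N)      # bucket value for tau = 1

found_clean = locate(m0, m1, N)

# A phase error below pi/N still rounds to the right index ...
small_error = 0.9 * math.pi / N
found_small = locate(m0, m1 * cmath.exp(1j * small_error), N)

# ... while a phase error above pi/N lands on a neighbouring index.
large_error = 1.5 * math.pi / N
found_large = locate(m0, m1 * cmath.exp(1j * large_error), N)

print(found_clean, found_small, found_large)
```

For $N = 8192$ the tolerated phase error is below 0.0004 rad, which is why the one-step method fails under noise, whereas a multiscale search with parameter $l$ only needs to distinguish $l$ values per round and tolerates errors up to $\pi/l$.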

E. THE MPSFT ALGORITHM BY THE ITERATIVE FRAMEWORK
Compared with the sFFT4.0 algorithm, only the discriminant used for location differs in the MPSFT algorithm. The matrix pencil method, like the Prony method, is a standard technique in signal processing for mode frequency identification. In this section, we use the matrix pencil method in the MPSFT algorithm to achieve two effects. Firstly, it identifies modes much more accurately. Secondly, it helps detect errors in the mode identification step and greatly reduces the number of spurious modes found. Relying on these two effects, we can solve the collision problem at little cost.
Suppose the number of significant frequencies in bucket $i$ is denoted by $a$. In most buckets $a = 0$, in some buckets $a = 1$, and only in a small number of buckets $a \geq 2$. Formula (21) can then be rewritten as formula (25), and the problem of reconstructing the spectrum becomes the problem of calculating $2a$ variables: $a$ amplitudes ($\hat{G}_{iL-\sigma f_0}\hat{x}_{f_0}, \cdots, \hat{G}_{iL-\sigma f_{a-1}}\hat{x}_{f_{a-1}}$) and $a$ positions ($\omega_N^{\sigma f_0}, \cdots, \omega_N^{\sigma f_{a-1}}$). This needs $2a$ equations. Here $\hat{y}_{L,\tau,\sigma}[i]$ is known and denoted by $m_\tau = \hat{y}_{L,\tau,\sigma}[i]$ for fixed $L$ and $\sigma$, $p_j$ represents the unknown $\hat{G}_{iL-\sigma f_j}\hat{x}_{f_j}$, and $z_j$ represents the unknown $\omega_N^{\sigma f_j}$. Taking the above into consideration, the problem can be formulated in the manner of BCH codes as formula (26), using $\tau = 0, 1, \cdots, 2a-1$:
$$\begin{bmatrix} 1 & 1 & \cdots & 1 \\ z_0 & z_1 & \cdots & z_{a-1} \\ \vdots & \vdots & & \vdots \\ z_0^{2a-1} & z_1^{2a-1} & \cdots & z_{a-1}^{2a-1} \end{bmatrix} \begin{bmatrix} p_0 \\ p_1 \\ \vdots \\ p_{a-1} \end{bmatrix} = \begin{bmatrix} m_0 \\ m_1 \\ \vdots \\ m_{2a-1} \end{bmatrix} \quad (26)$$
$$M_{a_m} = \begin{bmatrix} m_0 & m_1 & \cdots & m_{a_m-1} \\ m_1 & m_2 & \cdots & m_{a_m} \\ \vdots & \vdots & & \vdots \\ m_{a_m-1} & m_{a_m} & \cdots & m_{2a_m-2} \end{bmatrix}_{a_m \times a_m} \quad (27)$$
Suppose there are at most $a_m$ significant frequencies in the bucket. By the singular value decomposition (SVD) of the matrix $M_{a_m}$ defined in formula (27) for bucket $i$, we obtain $a_m$ singular values. For example, we can set $a_m$ equal to two, so that the SVD yields two singular values. If both singular values are small, there is no significant frequency. If only one singular value is big, there is one significant frequency. If both are big, there is more than one significant frequency. This is how the collision problem is solved. To determine the position of the frequency, the method is very similar to the sFFT4.0 algorithm: it uses binary search, with a discriminant inspired by the matrix pencil method. The details can be found in [16]. The MPSFT algorithm locates the positions and estimates the values of the frequencies using only $R$ ($= 2\log_2 L$) rounds, and its runtime and sampling complexity is approximately $O(K \log N \log_2(N/K))$.
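The collision test for $a_m = 2$ can be sketched with three bucket samples $m_0, m_1, m_2$ and the singular values of the resulting $2 \times 2$ Hankel matrix. This is a simplified, noiseless illustration: the sample model $m_\tau = \sum_j p_j z_j^\tau$ follows the definitions above, while the threshold value, the example amplitudes, and the function names are assumptions. The closed form for the singular values of a $2 \times 2$ matrix is used instead of a general SVD routine.

```python
import cmath
import math

def bucket_samples(p, z, count):
    """Noiseless bucket values m_tau = sum_j p_j * z_j**tau for tau = 0..count-1."""
    return [sum(pj * zj ** t for pj, zj in zip(p, z)) for t in range(count)]

def singular_values_2x2(a, b, c, d):
    """Singular values of [[a, b], [c, d]] from the Frobenius norm and the determinant."""
    fro = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2   # s1^2 + s2^2
    det = abs(a * d - b * c)                                      # s1 * s2
    disc = math.sqrt(max(fro * fro - 4 * det * det, 0.0))
    return math.sqrt((fro + disc) / 2), math.sqrt(max((fro - disc) / 2, 0.0))

def count_modes(m, threshold):
    """Number of big singular values of the Hankel matrix [[m0, m1], [m1, m2]]."""
    return sum(1 for s in singular_values_2x2(m[0], m[1], m[1], m[2]) if s > threshold)

N = 64
def z_of(f):
    return cmath.exp(-2j * cmath.pi * f / N)   # z_j = omega_N^{f_j} with sigma = 1

THRESHOLD = 0.1   # hypothetical; in practice it would depend on the noise level

empty = count_modes(bucket_samples([], [], 3), THRESHOLD)
single = count_modes(bucket_samples([1.0], [z_of(5)], 3), THRESHOLD)
collision = count_modes(bucket_samples([1.0, 0.8], [z_of(5), z_of(20)], 3), THRESHOLD)

print(empty, single, collision)
```

A single mode gives a rank-one Hankel matrix, so only one singular value is large; two modes in the same bucket make the matrix full rank, which flags the collision.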
After analyzing the two types of frameworks, the five methods to reconstruct the spectrum, and the corresponding algorithms, we summarize the results in Table 1 (see footnote 1), together with information on other sFFT algorithms and the fftw algorithm.
From Table 1, we can see that the sFFT3.0 algorithm has the lowest runtime and sampling complexity, but it is not robust. The other algorithms using the flat window are robust, but compared with other sFFT algorithms they have no advantage in sampling complexity, except for the sFFT4.0 algorithm.

V. EXPERIMENTAL EVALUATION
In this section, we evaluate the performance of five sFFT algorithms using the flat window filter: the sFFT1.0, sFFT2.0, sFFT3.0, sFFT4.0, and MPSFT algorithms. All of them are implemented in C or C++ so that their runtime characteristics can be evaluated empirically. We first compare the runtime, percentage of the signal sampled, and robustness of these algorithms with each other. Then we compare these characteristics with those of other algorithms: the fftw, sFFT-DT, FFAST, and AAFFT algorithms. Finally, we compare the runtime of these algorithms with that of their optimized versions. All experiments are run on a CentOS 7.6 computer with 4 Intel(R) Core(TM) i5-4570 3.20 GHz CPU cores, a cache size of 6144 KB, and 8 GB of RAM.

A. EXPERIMENTAL SETUP
In the exactly sparse case, the test signals are generated by selecting $K$ frequencies from the $N$ frequencies uniformly at random, assigning each a magnitude of 1 and a uniformly random phase, and setting the remaining frequencies to zero. In the general sparse case, the test signals are generated in the same way but are combined with additive white Gaussian noise whose variance depends on the required SNR. Each point in the figures is the average result over 5 runs with 5 different instances. The parameters of the algorithms are chosen to balance time efficiency and robustness.
1 The performance of the algorithms using the flat window is derived above. The performance of the other algorithms is taken from [2], [3], [5], [6], [8]. The analysis of robustness is explained in the next section.
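The test signal construction described in this subsection can be sketched as follows. This is an illustration only, not the code of the testing platform: the function names are hypothetical, the SNR is assumed to be the ratio of average signal power to noise power in dB, and the noise is added to whichever sequence is passed in, since the text does not state the domain in which the noise is applied.

```python
import cmath
import math
import random

def make_sparse_spectrum(n, k, rng):
    """K frequencies chosen uniformly at random, magnitude 1, uniformly random phase."""
    spectrum = [0j] * n
    for f in rng.sample(range(n), k):
        spectrum[f] = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    return spectrum

def add_awgn(signal, snr_db, rng):
    """Add complex white Gaussian noise; SNR = signal power / noise power (assumed definition)."""
    signal_power = sum(abs(v) ** 2 for v in signal) / len(signal)
    noise_variance = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_variance / 2.0)   # split evenly between real and imaginary parts
    return [v + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for v in signal]

rng = random.Random(2020)   # fixed seed so that an instance can be reproduced
N, K, SNR_DB = 1024, 16, 10.0

spectrum = make_sparse_spectrum(N, K, rng)   # exactly sparse case
noisy = add_awgn(spectrum, SNR_DB, rng)      # general sparse case

nonzero = sum(1 for v in spectrum if v != 0)
signal_power = sum(abs(v) ** 2 for v in spectrum) / N
noise_power = sum(abs(b - a) ** 2 for a, b in zip(spectrum, noisy)) / N
print(nonzero, 10.0 * math.log10(signal_power / noise_power))
```

The printed SNR is an empirical estimate and fluctuates slightly around the requested value from one instance to the next.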

B. COMPARISON EXPERIMENT ABOUT DIFFERENT ALGORITHMS USING THE FLAT FILTER OF THEMSELVES
We plot Figure 4, representing runtime vs signal size and vs signal sparsity for the sFFT1.0, sFFT2.0, sFFT3.0, sFFT4.0, and MPSFT algorithms in the exactly sparse case (see footnote 2). As mentioned above, the runtime is determined by two factors: the number of rounds (which mainly depends on $R$) and the time cost of one round (which mainly depends on $w$). From Figure 4 we can see: 1) The runtime of these five algorithms is approximately linear in the log scale as a function of $N$ and in the standard scale as a function of $K$. The reason is that $R$ and $w$ grow with $\log N$ and $K$. 2) The ranking of the five algorithms by runtime, from best to worst, is sFFT3.0 > sFFT4.0 > sFFT2.0 > sFFT1.0 > MPSFT. The reason is that their respective $R$ is about 2, $2\log_8 L$, approximately $\log N$, $\log N$, and $2\log_2 L$. We plot Figure 5, representing the percentage of the signal sampled vs signal size and vs signal sparsity for the sFFT1.0, sFFT2.0, sFFT3.0, sFFT4.0, and MPSFT algorithms in the exactly sparse case (see footnote 3). As mentioned above, the percentage of the signal sampled is also determined by two factors: the number of rounds and the number of samples taken in one round. From Figure 5 we can see: 1) The percentage of the signal sampled by these five algorithms is approximately linear in the log scale as a function of $N$ and in the standard scale as a function of $K$. 2) The ranking of the five algorithms by sampling complexity, from best to worst, is sFFT3.0 > sFFT4.0 > MPSFT > sFFT2.0 > sFFT1.0, because of their different $R$.
We plot Figure 6, which shows the runtime and L1-error vs. SNR for the sFFT1.0, sFFT2.0, sFFT3.0, sFFT4.0 and MPSFT algorithms. From Figure 6 we can see that 1) the runtime is approximately constant with respect to SNR. 2) To a certain extent, these algorithms are all robust, but when the SNR is low, only MPSFT satisfies the robustness guarantee. When the SNR is medium, sFFT1.0 and sFFT2.0 also meet the robustness guarantee. Only when the SNR is greater than 20 dB can sFFT4.0 deal with noise interference. The reason is that binary search performs better than the voting method under heavy noise, and that multiscale search does not perform well in noisy situations according to formula (26) and Figure 3. For the sampling results in Figure 5, the general sparse case is very similar to the exactly sparse case, except for the sFFT3.0 algorithm. The L0-error, L1-error and L2-error of all experiments are available at https://github.com/zkjiang/-/tree/master/docs/sfft project /experiment data
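As an illustration of how the reported error figures could be computed from a recovered spectrum, the sketch below uses one plausible set of definitions. These definitions are assumptions made here and are not taken from the text: the L0-error counts positions where the estimated and reference supports disagree, the L1-error is the total absolute deviation averaged over the K large coefficients, and the L2-error is the corresponding root-mean-square deviation. The function name `spectrum_errors` and the threshold `tol` are also hypothetical.

```python
import numpy as np

def spectrum_errors(estimate, reference, k, tol=1e-6):
    """Return (l0, l1, l2) errors between an estimated and a reference spectrum.

    Assumed definitions (not from the paper):
      l0: number of bins where the two supports disagree,
      l1: sum of |estimate - reference| divided by k,
      l2: sqrt(sum of |estimate - reference|^2 divided by k).
    """
    est = np.asarray(estimate, dtype=complex)
    ref = np.asarray(reference, dtype=complex)
    diff = est - ref

    # A bin belongs to a support if its magnitude exceeds the tolerance.
    est_support = np.abs(est) > tol
    ref_support = np.abs(ref) > tol

    l0 = int(np.count_nonzero(est_support != ref_support))
    l1 = float(np.sum(np.abs(diff)) / k)
    l2 = float(np.sqrt(np.sum(np.abs(diff) ** 2) / k))
    return l0, l1, l2
```

With definitions of this kind, an algorithm that places one coefficient at the wrong frequency is penalised twice in the L0-error (one missed bin and one spurious bin), which is why the L0-error is a useful complement to the L1-error when the SNR is low.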

C. COMPARISON EXPERIMENT ABOUT ALGORITHMS USING THE FLAT FILTER AND OTHER ALGORITHMS
We plot Figure 7, which shows runtime vs. signal size and vs. signal sparsity for the sFFT1.0, sFFT4.0, AAFFT, R-FFAST, SFFT-DT and fftw algorithms in the general sparse case. From Figure 7, we can see that 1) the runtime of these algorithms is approximately linear in the log scale as a function of N, except for the fftw algorithm, and approximately linear in the standard scale as a function of K, except for the fftw and SFFT-DT algorithms. 2) The ranking of the six algorithms by runtime, fastest first, is sFFT4.0 > sFFT1.0 > AAFFT > SFFT-DT > fftw > R-FFAST when N is large. The reason is that the Least Absolute Shrinkage and Selection Operator (LASSO) method used in the R-FFAST algorithm costs a lot of time, and the SVD and CS methods used in the SFFT-DT algorithm also cost a lot of time. 3) The ranking of the six algorithms by runtime, fastest first, is fftw > SFFT-DT > sFFT4.0 > sFFT1.0 > AAFFT > R-FFAST when K is large. The reason is that, when K is large, algorithms using the aliasing filter save time by using a small number of buckets in the first stage compared with algorithms using the flat filter.
We plot Figure 8, which shows the percentage of the signal sampled vs. signal size and vs. signal sparsity for the sFFT1.0, sFFT4.0, AAFFT, R-FFAST, SFFT-DT and fftw algorithms in the general sparse case. From Figure 8, we can see that 1) the percentage of the signal sampled is approximately linear in the log scale as a function of N, except for the fftw and SFFT-DT algorithms. The reason is that sampling in low dimensions in sFFT algorithms can decrease the sampling complexity, and that the CS and SVD methods impose a limit on the bucket size in the SFFT-DT algorithm. The percentage of the signal sampled is approximately linear in the standard scale as a function of K, except for the R-FFAST and SFFT-DT algorithms. The reason is that algorithms using the aliasing filter save time by using fewer buckets. 2) The ranking of the six algorithms by sampling complexity, fewest samples first, is R-FFAST > sFFT4.0 > AAFFT > SFFT-DT > sFFT1.0 > fftw when N is large. 3) The ranking by sampling complexity, fewest samples first, is SFFT-DT > sFFT4.0 > AAFFT > sFFT1.0 > fftw when K is large. The reason is that algorithms using the aliasing filter need fewer buckets than the other algorithms: the number of buckets used in the R-FFAST algorithm depends only on the prime numbers obtained from N, and the number of buckets used in the SFFT-DT algorithm depends only on the limit on the bucket size. The exactly sparse case is very similar, including for the sFFT3.0 algorithm. The FFAST and R-FFAST algorithms are not available when K is very large. There is a limit on the bucket size in the SFFT-DT algorithm (L is not allowed to exceed 1024).
We plot Figure 9, which shows runtime and L1-error vs. SNR for the sFFT1.0, sFFT4.0, AAFFT, SFFT-DT and fftw algorithms. From Figure 9 we can see that 1) the runtime is approximately constant with respect to SNR. 2) To a certain extent, these five algorithms are all robust, but when the SNR is low, only the fftw algorithm satisfies the robustness guarantee. When the SNR is medium, the sFFT1.0, AAFFT and SFFT-DT algorithms also meet the robustness guarantee. Only when the SNR is greater than 20 dB can the sFFT4.0 algorithm deal with noise interference.

D. COMPARISON EXPERIMENT ABOUT THE SAME ALGORITHM WITH OPTIMIZATION AND WITHOUT OPTIMIZATION
We plot Figure 10, which shows runtime vs. signal size and vs. signal sparsity for the sFFT1.0-mit, sFFT2.0-mit, sFFT1.0-eth and sFFT2.0-eth algorithms in the general sparse case. From Figure 10, we can see that software optimization accelerates the same algorithm considerably.

VI. CONCLUSION
In the first part, the paper introduces the techniques used in sFFT algorithms, including random spectrum permutation, the window function and frequency bucketization. In the second part, we analyze in detail five typical algorithms using the flat filter: the sFFT1.0 algorithm using the voting method and the sFFT2.0 algorithm using the heuristic voting method, both in the one-shot framework, and the sFFT3.0 algorithm using the phase encoding method, the sFFT4.0 algorithm using the multiscale phase encoding method and the MPSFT algorithm using the matrix pencil method, all in the iterative framework. The theoretical performance of these five algorithms, including runtime complexity, sampling complexity and robustness, is summarized in Table 1. In the third part, we carry out three categories of experiments on signals of different SNR, different N and different K, using a standard testing platform and nine different sFFT algorithms, and record the runtime, the percentage of the signal sampled and the L0, L1 and L2 errors in every situation, in both the exactly sparse case and the general sparse case. The analysis of the experiments is consistent with the theoretical inferences.
VOLUME 8, 2020
The main contributions of this paper are: 1) the development of a standard testing platform, built on the basis of the old platform, which can test many typical sFFT algorithms in different situations; 2) conclusions, in theory and in practice, on the characteristics and performance of five typical sFFT algorithms using the flat window filter: the sFFT1.0, sFFT2.0, sFFT3.0, sFFT4.0 and MPSFT algorithms.