Window Functions with Minimum-Sidelobe Derivatives for Computing Instantaneous Frequency

In the field of nonstationary signal analysis and processing, the short-time Fourier transform (STFT) is frequently used to convert signals into the time-frequency domain. The instantaneous frequency (IF) of the STFT is defined as the time derivative of the STFT phase and plays an important role in the reassignment method and the synchrosqueezing transform (SST). In this paper, we propose a framework to design a window function for computing the IF of STFT. Computing the IF requires the STFT with a window and STFT with its derivative, i.e., the IF computation depends on both the window function and its derivative. To design a window suitable for computing the IF, we formulate the window design problem as a sidelobe minimization problem of the corresponding derivative. Two windows are designed considering the sidelobe energy or the highest sidelobe level as cost functions for minimizing the sidelobes of their derivatives. The SST using the proposed window provides a sharper time-frequency representation compared to those produced using ordinary bandwidth-adjustable windows.


I. INTRODUCTION
T IME-frequency (T-F) representations are highly important in nonstationary signal analysis and processing. The short-time Fourier transform (STFT) is widely used to convert a signal into the T-F domain, owing to its simplicity and well-understood structure [1]- [4]. The resolution of a T-F representation obtained by the STFT is limited in Heisenberg's uncertainty principle. To adjust its resolution, many window functions have been proposed from various viewpoints, such as frequency responses [5]- [14] and numerical stability in signal processing [15]- [20].
The reassignment method and the synchrosqueezing transform (SST) have been proposed to overcome Heisenberg's uncertainty principle [21]. The reassignment method was first proposed by Kodera to improve the readability of the T-F representation obtained by the STFT [22]. Then, Auger and Flandrin popularized the reassignment method by discovering an efficient computational method; furthermore, they generalized the reassignment method to T-F representations in Cohen's class and time-scale representations [23]. Then, the reassignment method was generalized for any filterbank [24], [25]. The reassignment method sharpens a T-F representation using the time and frequency derivative of its phase at the expense of invertibility. The time and frequency derivatives of the STFT phase are referred to as the instantaneous frequency (IF) and group delay, respectively.
In the context of audio signal analysis, Daubechies and Maes proposed the SST [26], [27], which is a variant of the reassignment method. The SST performs frequencyonly reassignment to a complex-valued T-F representation to sharpen the T-F representation while ensuring invertibility. Subsequently, the SST was also generalized for STFT [28], [29] and other representations [25], [30] and has been widely studied [31]- [35]. The FSST reassigns the spread components using the IF, which is affected by the window function. Therefore, to improve the performance of the FSST, the window should be designed considering computing the IF.
Moreover, IF has also been employed in other applications, such as phase vocoder [36]- [39], T-F mask estimation [40], VOLUME x, xxxx [41], phase conversion [42]- [45], and speech processing [46]- [48]. In the context of a phase vocoder, the effect of a window on the IF computation has been demonstrated by comparing several existing windows [38]. Therefore, designing a window for computing the IF can also improve the performance of their applications.
A method to compute the IF uses the STFT with a window and with its (time-)derivative [23], which can compute accurately even in the discrete setting. That is, the computed IF depends on both the window and its derivative. Furthermore, interference of multiple signal components influences the IF computation. To reduce the interference of multiple signal components, the window and its derivative should be designed to reduce the T-F spreading under Heisenberg's uncertainty.
The main purpose of a window design for reducing spreading is its frequency response because the spread in the time direction can be controlled relatively easily by the support of the window. Hence, the sidelobes of the frequency responses of the window and its derivative need to be reduced. In particular, the sidelobes of the window derivative should be given more attention since the differential operator emphasizes high-frequency components. Several window functions are designed by considering the continuity at their edges, which is related to the sidelobe of their derivatives [13], [14]. However, no method has explicitly considered the frequency response of the window derivative. Designing a window function to minimize the sidelobes of the frequency response of its derivative is expected to obtain a window function that is more suitable for IF computation. Therefore, in this paper, we propose a framework for designing a window function for IF computation. The proposed method first designs the window derivative to minimize the sidelobes and then estimates the window function from the designed window derivative. The designed windows are evaluated by the IF computation and an application to the FSST.
This paper aims to expand on our five-page conference paper [49], in which we proposed a window function that is designed to minimize the sidelobe energy of its derivative and validated its performance in limited experiments. The contribution of this paper is summarized as follows: • A detailed explanation about computing IF in continuous and discrete STFT (Sec. II). • An additional proposed window is designed to minimize the highest sidelobe level of its derivative (Sec. IV-C). • A proposed window estimation method from the window derivative for minimizing the truncation effect (Sec. IV-D). • Detailed evaluations of the proposed windows in terms of computing IF (Sec. VI). • Quantitative evaluations of FSST using the Rényi entropy and the Earth mover's distance (Sec. VII). The rest of this paper is organized as follows. Sections II and III introduce the IF of the STFT and bandwidthadjustable windows, respectively. Then, Section IV explains our proposed method for designing windows. Section V (n, m)th element of a matrix X (·) Complex conjugate (·) T Transpose (·) * Conjugate transpose x, y Inner product of x, y ∈ C L ; x, y = y * x · Euclidean norm 1 N N -dimensional vector whose elements are all one Diagonal matrix whose diagonal elements are x x Round function that rounds x to the nearest integer x Floor function presents the frequency responses of windows designed by the proposed method. Section VI provides numerical experiments to evaluate the performance of the designed windows in terms of computing IF. Section VII presents the performance of applying the designed window to the FSST, and the conclusion of this paper is presented in Section VIII.

A. NOTATION
The list of notations and abbreviations used in this paper are summarized in Tables 1 and 2, respectively. We define the frequency response of a discrete signal f ∈ C L as where i is the imaginary unit. The discrete Fourier transform (DFT) F : and its inverse F −1 : C L → C L is given by

B. COMPUTING IF USING STFT
The STFT of a signal f ∈ L 2 (R) with a window function g ∈ L 2 (R) ∩ C 1 (R) is defined as where t ∈ R and ω ∈ R represent time and frequency, respectively. The STFT can be represented as where M f g := |V g f | is the STFT magnitude, and Φ f g is the STFT phase. The IF of the STFT is defined as the time derivative of the phase [50], [51], For instance, let us consider the IF of a continuous sinusoid The STFT of s(t) is explicitly expressed as whereĝ(ω) is the Fourier transform of g(t). Then, its phase is given by Consequently, the IF of s(t) is calculated as This corresponds to the difference between the frequency of the sinusoid ξ s and the frequency axis ω, which allows us to observe detailed frequency information from the spread T-F representation obtained by the STFT. Some studies refer to it as the relative instantaneous frequency [52] 1 . The straightforward approach for computing IF f g from (6) in the discrete setting is an approximation of the time derivative of the phase by finite differences. However, this suffers from the phase unwrapping problem [53]. Avoiding such problems, [23] proposed an alternative expression of the IF given by where g = dg/dt is the derivative of the window g. This is derived from the following calculation using the chain rule: In addition, the derivative of the STFT with respect to time can be rewritten as According to (11), computing the IF requires the STFT using the window function g and its derivative g . Hence, the computed IF depends on both windows.

C. COMPUTING IF IN DISCRETE STFT
The discrete version of the STFT for a discrete signal f ∈ C L with a discrete window function g ∈ C L is written as where n = 0, 1, . . . , L − 1 is the time-shift index and As in (11), the IF of the discrete STFT can be computed by the window g and its spectral derivative g = Dg, The spectral differentiation operator D is defined as where The rationale for using the spectral derivative as a counterpart of the continuous-time derivative is given by [24] 2 .
An example of a window function and its derivative obtained by spectral differentiation is illustrated in Fig. 1. In Fig. 1, the signal length is L = 18, and the window is STFT requires computing the DFT of the entire signal, which obstructs its application to real-time processing or lengthy signal analysis [54]- [56]. In practice, the window derivative is truncated to have the same support as the original window.

D. COMPARISON OF SPECTRAL AND ANALYTICAL DERIVATIVES
When the window function is originally defined as a differentiable continuous function, another possible method for calculating the window derivative instead of spectral differentiation is to differentiate analytically and sample the derivative. Let us illustrate this method using the Hann window, which is one of the most famous of such window functions. The Hann window supported on [0, N − 1] is defined as Its analytical derivative is given by Here, we compare the spectral and analytical derivatives in terms of the computed IF of a sinusoid. Considering a discrete complex sinusoid for l = 0, 1, . . . , L − 1, we evaluated the error between the analytical IF of the sinusoid (10) and the computed IF With the same manipulations in (8), the STFT of sinusoid s is calculated as Then, the IF of sinusoid s is calculated as which is independent of the time index n. When g is realvalued, (24) is simplified as Thus, the error (22) is rewritten as Fig. 2 shows the frequency responses of the window derivatives and errors of IF computation using these window derivatives. The window length and the signal length were set to N = 2 7 and L = 2 12 . The mainlobe width ω MW is defined as the first null point of the frequency response except for ω = 0. The error in the IF computed by the analytical derivative is larger than that of the spectral derivative. Therefore, this paper focuses on spectral differentiation, even if these windows are analytically differentiable.

III. BANDWIDTH-ADJUSTABLE WINDOWS
One aim of window function design is to control T-F spreading under Heisenberg's uncertainty principle. The spread in the time direction can be controlled by setting the length of the window function. We assume that a window g is supported on [0, N − 1] and let w ∈ C N denote its nonzero part, i.e., To obtain a well-localized T-F representation, a window should be designed so that its frequency response has a narrow mainlobe and low sidelobe levels under the defined window length N . However, a window function has a tradeoff between the mainlobe width and the sidelobe level. The mainlobe width is closely related to the appearance of the T-F representation and can be intuitively chosen according to the application. Thus, the sidelobe characteristics should be optimized under the mainlobe width ω MW chosen according to the application. The sidelobe energy (SE) and highest sidelobe level (HSL) are used to evaluate the localization of the frequency response of the window functions, which are defined as SE = 10 log 10 where W ωMW (ω) is a weight function, The SE and HSL of a window function are shown in Fig. 3.
Many windows, such as the rectangular window, Bartlett window, Hann window, Blackman window, and Nuttall window [13], depend only on the window length. However, some window functions contain additional parameters for adjusting the bandwidth; these are referred to as adjustable windows. The Dolph-Chebyshev window [5], the Slepian window [6], the Kaiser window [8], the Saramäki window [9], the ultraspherical window [10], [11], the cosh window [12], and the Tukey window (tapered cosine window) [57] belong to this class. Among them, the Slepian window and the Dolph-Chebyshev window are characterized by the following optimization problems: respectively, where W ∈ (0, 1 2 ]. These windows are designed to have a well-localized frequency response in terms of SE and HSL, and they certainly show better characteristics than other window functions (as indicated in the left side of Fig. 10).

IV. PROPOSED METHOD
We now propose a window design method to reduce the influence of the sidelobe of the window derivative on the IF computation. The comparison between the conventional and proposed methods of computing a window function and a window derivative pair is illustrated in Fig. 4. In general, to obtain the pair of a window and its derivative, the window function is first designed, and then the window derivative is calculated by differentiating the designed window [ Fig. 4 (a)]. In contrast, our method first designs the window derivative to minimize the sidelobes and then estimates the window function from the window derivative [ Fig. 4

A. PROBLEM FORMULATION
To restrict the spread in the time direction, we assume that the window derivative is supported on [0, N − 1], i.e., where z ∈ C N corresponds to the nonzero part of g . Calculating the window derivative from the window function can be performed straightforwardly by spectral differentiation. Conversely, from z such that z, 1 N = 0, g can be calculated as the spectral integration: and c is an integral constant. z, 1 N = 0 constrains the integration of the Fourier series to also be the Fourier series. Therefore, we formulate the design of the window derivative as where Θ(z) is an objective function measuring the sidelobes of the frequency response of z, which corresponds to the SE or HSL of z. In summary, we first design the window derivative z by (36) and then estimate the window function from z by (34).
In the remainder of this section, Sec. IV-B and Sec. IV-C, we explain the methods for designing the window derivative to minimize the SE and HSL, respectively. Estimating the original window w from the designed window derivative z is introduced in Sec. IV-D. Hereinafter, the window derivatives minimizing the SE and HSL are referred to as the Slepian window derivative and the Chebyshev window derivative, respectively.

B. SLEPIAN WINDOW DERIVATIVE
This subsection explains the design of the Slepian window derivative. Considering the case where the cost function Θ(z) in (36) is the ratio of the energy outside [−W, W ] to the total energy, the design problem of the Slepian window derivative is formulated as The cost function in Eq. (37) can be rewritten as the Rayleigh quotient: where S N ∈ R N ×N is a real-symmetric matrix whose elements are given by Fixing z T z = 1 and reducing the constant 1 not relevant to minimization, (37) can be rewritten as Such a problem can be simplified to an eigenvalue problem [58]. Let the Lagrangian function associated with (41) be where µ and η are the Lagrange multipliers. The optimal solution z to (41) satisfies the following necessary conditions: Multiplying (43) on the left by 1 T N , η can be calculated as Substituting η into (43), we obtain where the eigenvectors of K N are candidates for the solution to the problem (41). Futhermore, since the eigenvectors of K N satisfy (44) and (45), multipying (47) on the left by z T , we get Therefore, the eigenvector corresponding to the largest eigenvalue µ 0 is the solution to the problem (41). However, finding the eigenvalues of K N is numerically ill-conditioned; likewise, S N . It follows from the following fact and proposition. Fact 2. Let λ 0 , λ 1 , . . . , λ N −1 denote the eigenvalues of S N such that and their corresponding eigenvectors be v 0 , v 1 , . . . , v N −1 , whose norm and signs are determined so that v k = 1, have the following properties [6]: for k = 0, 1, . . . , N − 1.
Proof. According to (52), v k for k = 1, 3, . . . , 2 N/2 − 1 satisfies Then, the following relationship holds: Therefore, λ k and v k are the eigenvalues and eigenvectors of K N , respectively.
Finding the eigenvalues of S N is numerically illconditioned since most eigenvalues of S N are clustered around 1 or 0, as shown by the blue line in Fig. 5. Even if the eigenvalues of S N are nondegenerate, they behave as if they are degenerate because of the rounding error in the numerical computation. According to Proposition 1, its eigenvalues λ k for k = 1, 3, . . . , 2 N/2 − 1 are also eigenvalues of K N . Hence, most eigenvalues of K N are also clustered around 1 or 0, as shown by the red line in Fig. 5.
Here, K N is centrosymmetric since both T N and S N are centrosymmetric [59]. Thus, the eigenvectors are symmetric or antisymmetric, but their order is unclear. When z are symmetric or antisymmetric, (Fz)(ω) can be represented as the cosine or sine series [60]. Fig. 6 shows (Fz)(ω) localized in [−W, W ] under z, 1 N = 0. Note that the linear phase of the frequency response is ignored to make the spectrum realvalued for display. The symmetric case contains an extra extremum in [−W, W ] compared with the antisymmetric case to satisfy the constraint z, 1 N = 0. From this observation, the antisymmetric window localizes the frequency response in [−W, W ] under the constraint z, 1 N = 0 more than the symmetric window. Since v k for k = 1, 3, . . . , 2 N/2 − 1 are antisymmetric from Proposition 1, the eigenvector v 1 corresponding to the second largest eigenvalue λ 1 is the solution to (37). To compute the eigenvector v 1 , efficient methods for computing the eigenvectors of S N [61], [62] are available instead of computing the eigenvectors of K N directly.

C. CHEBYSHEV WINDOW DERIVATIVE
In this subsection, we consider the design problem of the Chebyshev window derivative. Similar to (37), the direct formulation of the design problem of the Chebyshev window derivative is Since the constraint z, 1 N = 0 and the symmetry of the cost function with (37), the solution to (55) should be antisymmetric.
After estimating ω 0 , ω 1 , . . . , ω K using the MRemez algorithm, the Chebyshev window derivative z can be computed from its frequency response by sampling and performing the inverse DFT,

D. WINDOW ESTIMATION FROM THE WINDOW DERIVATIVE
In the previous subsections, two window derivatives were introduced as solutions to (36). In this subsection, the estimation method of the window from the obtained window derivative is presented.
is rewritten as Recall that c is the integral constant, which has to be determined for estimating the window g. In our conference paper [49], estimating c is formulated as the minimization problem of the ratio of the energy outside [−W, W ] to the total energy, similar to the Slepian window, to obtain the well-localized frequency response. An example of the window obtained by the spectral integration from a window derivative supported on [0, N − 1] (N = 7) is illustrated in Fig. 8. As with the spectral differentiation, even if window derivatives are supported on [0, N −1] (N < L), the integrated windows are not supported on [0, N − 1]. The integrated windows are truncated to have the same support as the window derivatives in practice. This may change the frequency response of the window and decrease the accuracy of the IF computation. In this paper, we propose a method for estimating c based on minimizing the truncation effect.
The truncated window outside [0, N − 1] can be expressed using the zero-padding matrix P L,N as P L,N P T L,N (g 0 + c1 L ).
We estimate c by minimizing the squared error of the window before and after truncation, i.e., our proposed estimation problem is formulated as Since this formulation minimizes the energy of the outer components, the proposed integration method is applicable for the case N < L. The proposed integration method is effective when the window width is limited, as in realtime processing, because the effect of truncation can be significant.

V. FREQUENCY RESPONSES OF DESIGNED WINDOWS
The proposed windows are compared with the Slepian window and the Dolph-Chebyshev window in this section. Hereafter, the windows calculated from the Slepian window derivative and the Chebyshev window derivative are referred to as the proposed Slepian window and the proposed Chebyshev window, respectively. First, Fig. 9 illustrates the shapes and frequency responses of the windows and the window derivatives. The length of the windows was set to N = 2 7 , and the signal length for the spectral differentiation/integration was set to L = 2 12 . Each parameter W for the window was chosen so that ω MV = 0.03. Note that the frequency responses in Fig. 9 were normalized such that the maxima are 0 dB. The derivatives of the Slepian and Dolph-Chebyshev windows have nonzero values outside [0, N − 1], as in Fig. 1, which are omitted in Fig. 9(c) and (g) since they are too small for illustration. The red lines in Fig. 9(d) and (h) show the frequency responses when these outside values are truncated. Conversely, the proposed Slepian and proposed Chebyshev windows have nonzero values outside [0, N − 1] due to the spectral integration. The frequency responses of the truncated windows are plotted as the red lines in Fig. 9(j) and (n).
In the time domain, there is little difference among the shapes of the four windows. The derivative of the Dolph-Chebyshev window ( Fig. 9(g)) has a slight oscillation at both ends compared to the others. According to Fig. 9(b) and (l), the Slepian window derivative has a similar sidelobe decay to the Slepian window. Likewise, the sidelobe decay of the Chebyshev window derivative in Fig. 9(f) resembles that of the Dolph-Chebyshev window in Fig. 9(p). In addition, the proposed Slepian and proposed Chebyshev windows have better sidelobe decays than the Slepian and Dolph-Chebyshev windows, respectively, although their highest sidelobe levels are slightly higher. This is because estimating a window from its derivative window in (34) suppresses the high frequencies. Furthermore, Fig. 9(d), (h), (j), and (n) show that the proposed Slepian and proposed Chebyshev windows have smaller effects on the frequency responses caused by truncation compared with the Slepian and Dolph-Chebyshev windows. The result indicates that the proposed design method can reduce the effect of truncation.
Then, the sidelobe energy and the highest sidelobe level of each window at various bandwidths ω MW are summarized in Fig. 10. Fig. 10 also shows the SE and HSL of four wellknown windows: Hann, Blackman, Nuttall, and truncated Gaussian windows. The truncated Gaussian window is represented by where σ is a parameter to control the mainlobe ω MW . Although they were originally defined as continuous functions,  we computed these derivatives by the spectral differentiation as mentioned at the end of Sec. II. The jumps in the mainlobe widths seen in the truncated Gaussian and Dolph-Chebyshev are due to nonzero minima in the frequency responses, as shown on the right side of Fig. 11.
When the bandwidth W was set to a higher value, both the SE and HSL of the Slepian, Dolph-Chebyshev, proposed Slepian and proposed Chebyshev windows decreased. Fig. 10 confirms that the Slepian window derivative and the Chebyshev window derivative were correctly designed with the desired optimality.

VI. EVALUATION OF IF COMPUTATION
In this section, the four windows are compared in terms of the IF computation. Throughout this section, the window length N = 2 7 and the signal length L = 2 12 . Each parameter W for the window was chosen so that ω MV = 0.03.

A. COMPARING WINDOW INTEGRATION METHODS
First, the proposed integration method in (79) was compared with the conventional estimation method (77) [49]. They were evaluated in terms of the error in computing the IF of the sinusoid in (25).
The frequency responses of the estimated windows from the Slepian and Chebyshev window derivatives using the two integration methods are shown in Fig. 12. The SE and HSL of each window are shown in the upper right corner of each subfigure. Although the conventional method determines c by minimizing the energy outside of [−W, W ] to reduce the SE, the frequency responses of the windows estimated by the two methods have comparable SE. However, in both cases of the two window derivatives, the changes due to truncation of the windows estimated by the proposed method were smaller than those of the conventional method.
The errors of the IF using two integrated windows are shown in Fig. 13. The errors using the conventional method increase due to truncation, but this is not seen in the results of the proposed method. This suggests that the proposed method can estimate the window to avoid an increase in the error due to truncation effects. Additionally, with or without truncation, the errors using the proposed method are smaller than those using the conventional method in [0, ω MW ]. This may be because minimizing the outer energy reduces the oscillation of the frequency responses.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

B. COMPARISON OF WINDOWS FOR IF COMPUTATION OF A SINUSOID
Second, the four windows were compared in terms of computing the IF of a sinusoid. They were also evaluated by the error in (26). The IF errors using the four windows are shown in Fig. 14. Regardless of truncation, the errors using the proposed windows are smaller than those using the conventional windows. Especially, the error using the proposed Chebyshev window was smaller than that using the Slepian window in Fig. 14    improve the accuracy of the IF computation of the sinusoid.

C. COMPARISON OF WINDOWS FOR IF COMPUTATION IN THE PRESENCE OF ANOTHER SINUSOID
Then, we consider estimating the IF of the sinusoid s from a signal composed of two complex sinusoids, where for l = 0, 1, . . . , L − 1. s and i correspond to the target and interference signals, respectively. The IF of x is expressed as where The error of the IF between s and x is given by According to (12), the real part of the numerator of (85) is zero since the STFT magnitude of a complex sinusoid is time-invariant. Hence, (85) can be rewritten using IF i g and IF s g as The transitions of e associated with the initial phase φ s , φ i , and the time index n are irrelevant for the evaluation. Moreover, from (25), the IF of a complex sinusoid is constant regardless of the time index n. Therefore, we consider the worst case of e in any ∆φ. Regardless of the sidelobe level, the error e becomes large when m/L is outside the mainlobe of the target signal or inside the mainlobe of the interference signal. Under these conditions, it is difficult to obtain a meaningful IF unless the amplitude ratio is quite large. Thus, we consider the case where m/L is inside the mainlobe of the target signal and outside the mainlobe of the interference signal. Assume that the amplitude ratio r satisfies |(Fg)(m/L − ξ s )/(Fg)(m/L − ξ i ))| > 1/r. This assumption is mild in the condition we consid- The deviation of (87) is shown in Appendix B. (87) indicates that as the amplitude ratio r decreases, the error of the IF increases. Fig. 15 plots the error of the IF computation in (87) for r = 1. The central bright areas in the four results correspond to cases in which the two mainlobes overlap. The top and bottom bright regions in the four results represent the errors of the IF outside the mainlobe of s. In addition, the error at m/L = ξ s in Fig. 15 is shown in Fig. 16.
The IF using the Dolph-Chebyshev window had the largest error among the four windows. In contrast, the results for the proposed Slepian window had the smallest error. The results using the Slepian window and the proposed Chebyshev window had comparable errors. These results suggest that reducing the sidelobe of the window derivative decreases the error of the IF computation.

VII. APPLICATION TO FSST
As an application for the IF computation, the proposed windows were applied to the FSST of an artificial signal and that of a speech signal. The FSST of a signal f with a window function g is defined as and δ[l] is the Kronecker delta. We used the Rényi entropy as a metric of the energy concentration of the FSST spectrogram [67]. The Rényi entropy of the FSST spectrogram is given by with α > 2 being recommended for the T-F domain measures [67]. α = 3 is chosen throughout the experiments. As shown in Fig. 10, sidelobes can be reduced by increasing the mainlobe width. In contrast, if the mainlobes of multiple components overlap, the error of instantaneous frequency becomes large. Therefore, an appropriate mainlobe width needs to be selected to reduce both effects. The Rényi entropy has also been used as a metric to select the variance in the Gaussian window in STFT and FSST [68], [69]. Since the mainlobe widths of the designed windows will also affect the Rényi entropy of the FSST, we evaluated the Rényi entropy of the FSST for various mainlobe widths.
To further evaluate the FSST of an artificial signal, we also used the Earth mover's distance [70]. The Earth mover's distance is an index that measures between two distributions and has been used to evaluate the synchrosqueezing-based   method [34], [35], [71]. This evaluation consists of averaging the 1D Earth mover's distance between the estimated T-F representation and the ideal representation at each time index n. If the ideal representation is known, the Earth mover's distance is a more valid metric for estimating the T-F representation than the Rényi entropy.

A. FSST OF ARTIFICIAL SIGNALS
The proposed windows were evaluated with the FSST of a real-valued artificial signal, which contained a sinusoid, a linear chirp, and a quadratic chirp. The window length was set to N = 2 6 because the calculation of the Rényi entropy of FSST was numerically unstable when N = 2 7 . The Rényi entropies of FSST with different mainlobe widths ω MW are plotted in Fig. 17. Table 3 shows the minimum values of the Rényi entropies in Fig. 17 and the corresponding bandwidths. The widely used truncated Gaussian window shows poor performance compared with the other four windows. The Slepian window and the proposed Chebyshev window show approximately equivalent performance. The proposed Slepian window achieves the best performance among the five windows. Then, the Earth mover's distance of FSST with different mainlobe widths ω MW are shown in Fig. 18, and their minimum values are shown in Table 4. Although the mainlobe widths that take the minima are different, the FSST spectrogram using the proposed Slepian has the smallest distance. These results indicate that the proposed Slepian window provides the most concentrated FSST spectrogram of the five windows. Fig. 19 illustrates the T-F representations obtained by the VOLUME x, xxxx  STFT and FSST in the case of Table 3. Focusing on the enlargement in the red box (right column of Fig. 19), it can be confirmed that the component outside the center frequency is reduced using the proposed Slepian window. The FSST spectrogram using the proposed Chebyshev window is almost as sharp as that using the Slepian window.

B. FSST OF SPEECH SIGNAL
Then, we considered a speech signal with a sampling frequency of 7,418 Hz. The window length was set to N = 2 7 . Since the ideal T-F representation is unknown, the Earth mover's distance cannot be used for evaluation, and only the Rényi entropy is used in this experiment. Fig. 20 and Table 5 show Rényi entropies of SST with different bandwidths and their minimum values, respectively. As in the case of the synthesized signal, FSST with the proposed Slepian window achieves the best performance in terms of the Rényi entropy. Fig. 21 depicts the T-F representations obtained by the STFT and FSST in the case of Table 5. Although the appearances of the spectrograms are similar, the proposed Slepian window provides the sharpest FSST spectrogram of the four windows. The results confirmed that the proposed Slepian window provides the sharpest FSST spectrogram of the four windows for the real speech signal as well.
Throughout the experiments in this section, the proposed Slepian window showed the best performance. Therefore, the proposed Slepian window can be useful for FSST applications. However, the proposed Chebyshev window may be preferable depending on the signal of interest because the sidelobes of the Chebyshev window derivative near the mainlobe are smaller than those of the Slepian window derivative.

VIII. CONCLUSION
In this paper, we proposed a framework to design a window function for the IF computation. The proposed method first designs the window derivative to minimize the sidelobes and then estimates the window function from the designed window derivative. We designed two minimum-sidelobe derivatives to minimize the SE and HSL, referred to as the Slepian and Chebyshev window derivatives, respectively. The integral constant that appears when integrating the designed window derivative is estimated to minimize the truncation effect. In the IF computation of a sinusoid, the proposed windows reduce the error compared with the results of certain other windows. In addition, the proposed Slepian window showed the best performance in the FSST among the several windows. Some phase-aware techniques use not only the IF but also the group delay and higher-order derivatives of the phase   [33]- [35]. Therefore, future work will include generalizing the proposed method to these elements and an investigation of the method's computational efficiency. .

APPENDIX A TRIGONOMETRIC REPRESENTATION OF THE FREQUENCY RESPONSE OF THE WINDOW DERIVATIVE
This appendix explains the derivation of (56) and the relation between α and z based on [60]. The frequency response of the antisymmetric window z can be represented as a sine series. When N is odd, Hence, using (57) and (58), (Fz)(ω) in the odd and even cases can be uniformly rewritten as (Fz)(ω) = e i( π 2 −(N −1)πω) Q(ω)P (ω, α), which equals (56).
After obtaining the solution α by solving (61), the window derivative z is calculated using (95) and when N is even.
(104) Because we assume that the amplitude of the target signal is greater than the amplitude of the interference signal, r > 1. In addition, if m/L is inside the mainlobe of the target signal and outside the mainlobe of the interference signal, |(Fg)(m/L − ξ s )/(Fg)(m/L − ξ i )| > 1. Hence, ρ must be greater than 1. Thus, the denominator of (104) is not zero, and ∂e ∂ϕ becomes zero if ϕ = 0 or ϕ = π. Substituting ϕ = 0 or ϕ = π for (102), The difference between the squares of these is Hence, the worst-case error is e π , and e π is equal to (87) by substituting ρ = r |(Fg)(m/L − ξ s )/(Fg)(m/L − ξ i )|.