A Comprehensive Overview on Modified Versions of Stockwell Transform For Power Quality Monitoring

The growing demand for accurate identification of power quality disturbances (PQD) through power quality (PQ) monitoring requires an appropriate digital signal processing (DSP) technique and a robust classifier. To this end, the Stockwell transform (ST), one of the most efficient DSP tools for feature extraction, and its several variants play a central role in the PQ assessment framework. Its time-varying spectral characteristics extract the local instantaneous frequency spectrum from the global temporal behavior of the PQD signal. However, the Standard ST suffers from poor time-frequency resolution because of its frequency-dependent Gaussian window (GW), whereas the analysis of statistically time-varying signals requires a suitable balance between time and frequency resolution. This paper therefore provides, for the first time, a comprehensive literature review of several modified versions of the Standard ST that aim to reduce the computational complexity of the algorithm and to maximize the energy concentration in the time-frequency plane. A comparative analysis of all the modified STs is presented in tabular form to summarize the key characteristics of each technique. Additionally, a case study is presented to substantiate the highest accuracy of the proposed algorithm among the compared ST variants. Apart from PQD classification, miscellaneous applications of the Standard ST and its modified variants are indicated. This review may serve as a valuable resource for researchers seeking to further improve the time-frequency resolution of ST, not only for classifying PQD but also for its other wide-ranging applications.


The associate editor coordinating the review of this manuscript and approving it for publication was Nagesh Prabhu.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

I. INTRODUCTION
In recent years, several power quality (PQ) monitoring algorithms developed for the identification of PQ issues have gained widespread attention from industry and researchers [1]. The primary reason is the proliferation of non-linear loads, solid-state switching devices, power electronic converters, power transfer switches, and protective equipment. Moreover, grid signals have become more distorted due to distributed generation and the extensive use of renewable energy sources [2], [3]. Consequently, the nature of the PQ signal is continuously changing, which demands greater accuracy in identifying complex PQ disturbances. For this purpose, various online and offline approaches to PQ monitoring are available in the literature, and they mainly involve two stages: the time-frequency analysis (or feature extraction) stage and the classification stage. This classification framework also plays a vital role in developing power quality disturbance (PQD) mitigation approaches by recognizing the underlying cause of a disturbance [4], [5], [6]. Commonly known PQDs include interruptions, oscillatory transients, voltage sag/swell, and flicker. To maintain the reliability of the power transmission and distribution network, accurate detection and mitigation of PQD are essential, and this is only possible by adopting highly efficient methodologies in each stage of PQD classification [7]. The time-frequency representation converts the one-dimensional time series into two-dimensional data of time and frequency, which shows how the spectral content of the PQD signal varies over time [8].
The most widely used time-frequency representation techniques for non-stationary signals are the short-time Fourier transform (STFT), Hilbert-Huang transform (HHT), wavelet transform (WT), Gabor transform (GT), chirp transform (CT), Stockwell transform (ST), and slant transform (SLT) [9], [10], [11], [12]. All of these digital signal processing (DSP) techniques are non-model-based (or nonparametric) methods, which require no prior knowledge of the signal. However, they suffer from low frequency resolution, which depends on the length of the signal being analyzed. To overcome this problem, there is another category of signal processing techniques known as model-based (or parametric) methods. In these methods, the model from which a signal is generated (e.g., a harmonic model or an autoregressive model) may be identified based on prior knowledge of the disturbance. MUSIC, the Kalman filter (KF), and ESPRIT are some of the common methods used for harmonic modeling [2], [3]. Before choosing a DSP technique for PQ assessment, its characteristics and disadvantages should be known. For instance, STFT provides the phase and frequency information of local sections of the signal as they vary over time. Although it provides some information on the time and frequency content of a signal, its fixed window size is the main drawback. The authors in [13] presented the mathematical description and some specific applications of the windowed DFT, or STFT, which is used to classify PQ issues as per IEEE Standard 1159 [14]. The next logical step in DSP techniques is WT, which introduces the concept of variable-sized windows. WT was implemented for the first time in [15]. It plays a significant role in recognizing PQ disturbances. WT uses multi-resolution analysis (MRA), which makes it a very powerful feature extraction tool.
Various forms of WT have been used in one-dimensional or two-dimensional settings, such as the continuous wavelet transform (CWT), complex continuous wavelet transform, discrete wavelet transform (DWT), and dual-tree complex wavelet transform (DTCWT) [16], [17]. Higher computational complexity, noise sensitivity, and dependency on the mother wavelet are some of the disadvantages associated with WT [9], [10]. The ST proposed by Stockwell et al. [18] overcomes the shortcomings of STFT and WT. It is one of the most powerful PQ assessment techniques available in the literature because of its frequency-dependent Gaussian window (GW), whose width is inversely proportional to the frequency of the signal. The window is therefore wider at low frequencies, which gives better frequency resolution, and narrower at high frequencies, which gives better time resolution [19], [20], [21], [22]. Despite the varying GW width, the Standard ST has a fixed time-bandwidth product, due to which it can give a false result in certain cases; e.g., a momentary interruption signal may be identified as a sag signal [23]. Therefore, a signal-dependent window, rather than a frequency-dependent one, is required to maintain a suitable balance between time and frequency resolution and to enhance the energy concentration in the time-frequency plane. This involves the following aspects: selection of the GW, selection of the GW parameters, and adjustment of the GW parameters, as depicted in Fig. 1.
To obtain high time resolution in the low-frequency band and high frequency resolution in the high-frequency band, the use of two or more GWs has been proposed in the literature. The selection of GW parameters also plays a dominant role for this purpose. Different scaling rules have been adopted for the GW, and it is found that a larger number of GW parameters gives more flexibility to enhance the energy concentration. The third aspect, the adjustment of GW parameters, determines the detection capability of ST for different PQD signals. Various heuristic approaches have been adopted to fix the values of GW parameters, but such approaches may not yield a truly optimal solution. The optimal selection of GW parameters is addressed in several modified versions of ST [24], [25], [26], [27], [28]. Most window-width-optimized ST methods formulate a non-linear inequality-constrained optimization problem in which the energy concentration measure (ECM) is the objective function and the inequality constraints set the boundary conditions of the GW. Genetic algorithms (GA), ant colony optimization (ACO), and particle swarm optimization (PSO) are some of the optimization techniques used to adjust these window parameters [29], [30], [31], [32]. On the classification side, the artificial neural network (ANN), fuzzy logic (FL), Bayesian classifiers (BC), support vector machine (SVM), decision tree (DT), nearest neighbor algorithm, hidden Markov model (HMM), and ensemble decision tree (or random forest) are some of the common classifiers available in the literature [33], [34], [35], [36], [37], [38], [39], [40], [41], [42], [43]. Their inputs are the statistical features extracted by the DSP techniques.
No attempt has been made in the literature to review ST and its variants, even though improved ST variants appear every year to provide better energy concentration in the time-frequency plane while reducing the computational complexity. This review therefore presents, for the first time, a comprehensive overview of several versions of ST, not only for PQ monitoring but also for miscellaneous applications such as biomedical engineering, seismography, and fault detection, since these applications cannot be omitted from a review in chronological order. In the course of improving time and frequency resolution, several versions of ST were adapted from techniques used in applications such as engine knock signals and biomedical signals, and these versions have since been used extensively for PQ monitoring.
In addition, a case study is presented to validate the performance of the proposed algorithm against the previous variants. Section II describes the basic mathematical equations of the Standard ST along with the derivation of the discrete GW. The need to modify the Standard ST due to its inherent disadvantages is discussed in Section III. Several modified versions of the Standard ST are presented in Section IV along with their comparative analysis, followed by a case study in Section V. Section VI illustrates the miscellaneous applications of the Standard ST and the modified STs. Concluding remarks with recommendations are given in Section VII.

II. STANDARD ST
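The Standard ST can be obtained mathematically in two ways: from the STFT or from the CWT. If τ, f and w(t) represent the time, frequency, and window function, then the STFT of a signal x(t) can be expressed as,

$$\mathrm{STFT}(\tau, f) = \int_{-\infty}^{\infty} x(t)\, w(\tau - t)\, e^{-j 2\pi f t}\, dt$$

The Standard ST can be derived by replacing the fixed window function of the STFT with a scalable and movable Gaussian function,

$$w(t) = \frac{|f|}{\sqrt{2\pi}}\, e^{-\frac{t^{2} f^{2}}{2}} \tag{2}$$

$$S(\tau, f) = \int_{-\infty}^{\infty} x(t)\, \frac{|f|}{\sqrt{2\pi}}\, e^{-\frac{(\tau - t)^{2} f^{2}}{2}}\, e^{-j 2\pi f t}\, dt$$

If p is the wavelet width and ψ is the mother wavelet, then the CWT is expressed as,

$$\mathrm{CWT}(\tau, p) = \int_{-\infty}^{\infty} x(t)\, \psi(t - \tau, p)\, dt, \qquad \psi(t, f) = \frac{|f|}{\sqrt{2\pi}}\, e^{-\frac{t^{2} f^{2}}{2}}\, e^{-j 2\pi f t}$$

The Standard ST can be obtained by modifying the phase information in the CWT as,

$$S(\tau, f) = e^{-j 2\pi f \tau}\, \mathrm{CWT}(\tau, f)$$

Thereby, the Standard ST offers additional characteristics beyond those of either the STFT or the CWT. Here the factor p is simply the standard deviation (σ) of the GW, which is inversely proportional to the frequency f,

$$\sigma(f) = \frac{1}{|f|}$$

To obtain the discrete version of the GW, (2) is first differentiated with respect to time,

$$\frac{dw(t)}{dt} = -f^{2}\, t\, w(t) \tag{7}$$

Applying the Fourier transform to each side of (7), with G(ω) denoting the Fourier transform of w(t),

$$j\omega\, G(\omega) = -j f^{2}\, \frac{dG(\omega)}{d\omega} \quad\Longrightarrow\quad \frac{dG(\omega)}{G(\omega)} = -\frac{\omega}{f^{2}}\, d\omega$$

Integrating both sides from 0 to ω,

$$\ln G(\omega) - \ln G(0) = -\frac{\omega^{2}}{2 f^{2}} \tag{11}$$

Because the Gaussian is normalized to unit area, its DC component is G(0) = 1, i.e., ln G(0) = 0, so (11) can be rewritten as,

$$\ln G(\omega) = -\frac{\omega^{2}}{2 f^{2}}$$

Applying the exponential to each side results in,

$$G(\omega) = e^{-\frac{\omega^{2}}{2 f^{2}}}$$

Finally, the discrete version of the GW of the Standard ST is obtained by setting ω = 2πα, α → m/NT, f → n/NT, where m, n = 0, 1, 2, ..., N − 1,

$$G(m, n) = e^{-\frac{2\pi^{2} m^{2}}{n^{2}}}$$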
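As a minimal sketch of the discrete Standard ST, the frequency-domain Gaussian G(m, n) = exp(−2π²m²/n²) can be applied to a shifted copy of the signal spectrum, one frequency row at a time. The function names `dft` and `stockwell`, the wrapping of m to negative indices so that the window is periodic, and the naive DFT used in place of an FFT are choices made here for a dependency-free illustration; they are not details taken from [18].

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (kept dependency-free)."""
    n_samples = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * math.pi * m * k / n_samples) for k in range(n_samples))
        for m in range(n_samples)
    ]


def stockwell(x):
    """Discrete Standard ST; returns rows n = 0..N//2, each with N time samples."""
    n_samples = len(x)
    spectrum = dft(x)
    # Row n = 0 is defined as the mean of the signal.
    rows = [[complex(sum(x) / n_samples)] * n_samples]
    for n in range(1, n_samples // 2 + 1):
        # Frequency-domain Gaussian G(m, n) = exp(-2 pi^2 m^2 / n^2),
        # with m wrapped to negative values above N/2 (assumption of this sketch).
        gauss = []
        for m in range(n_samples):
            m_wrapped = m if m <= n_samples // 2 else m - n_samples
            gauss.append(math.exp(-2.0 * math.pi ** 2 * m_wrapped ** 2 / n ** 2))
        row = []
        for j in range(n_samples):
            acc = 0j
            for m in range(n_samples):
                shifted = spectrum[(m + n) % n_samples]
                acc += shifted * gauss[m] * cmath.exp(2j * math.pi * m * j / n_samples)
            row.append(acc / n_samples)  # 1/N normalizes the unscaled DFT above
        rows.append(row)
    return rows
```

For a pure cosine located at bin 8 of a 64-sample record, row n = 8 has a magnitude close to 0.5 at every time index. The three nested loops also make the cubic cost of a direct evaluation visible.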
The Standard ST provides a complex-valued (real and imaginary) time-frequency spectral localization that is directly connected, via time averaging, to the Fourier spectrum. The features extracted from the transformed contours are used as inputs to the intelligent classification framework. The various advantages of ST have continued to attract researchers for two decades. The authors in [19] analyzed PQ signals using a modified WT (i.e., ST), which amounts to a slight modification of the local spectrum patterns, or a phase correction, in WT. This technique provided good time-frequency resolution. ST and modular neural network-based PQD recognition was presented in [44]. In [45], ST was applied to eight single-stage PQ disturbances and two complex PQ disturbances with a decision tree as the classifier. An ST-based probabilistic neural network algorithm was presented in [46] for the classification of eleven types of PQD.
This technique also reduces the number of features without losing their original characteristics. ST-based recognition of single-stage and multiple PQ disturbances was proposed in [20], where suitable features extracted from the transformed curve were provided as input to artificial neural network and decision tree-based classifiers. In [47], high-impedance fault detection was performed using ST, which extracts the third harmonic component of the phase angle of the current waveform. The authors in [48] recognized various underlying causes and types of PQ disturbances using a computationally efficient S-transform-based decision tree.

III. NEED OF MODIFICATIONS IN STANDARD ST
The shape of the window used to determine the energy concentration of the time-frequency distribution plays a pivotal role in any DSP technique. In the Standard ST, the window function is narrower in the time domain for high-frequency analysis, which results in poor frequency localization, and the transform has a high computational complexity of $O(N^{3})$. In addition, the possible magnification of noise amplitude in high-frequency regions and the correlation of several samples in the spectrum compromise this approach in practical applications such as PQD classification. Thus, many attempts have been made to optimize ST in order to improve the energy concentration in the time-frequency domain and to make it faster [49].

IV. MODIFIED VERSIONS OF STANDARD ST
The different versions of ST used for feature extraction of simple and complex PQ disturbances have been discussed in this section.

A. BASED ON WINDOW SELECTION
Windows other than the symmetrical GW have also been proposed as the kernel of ST to satisfy different criteria. To this end, an asymmetrical bi-GW made of two half GWs was proposed in [50] to improve the time resolution in the time-frequency domain, which is poor in the Standard ST due to the long front taper of the GW. From (2), the expression of a specific window function, $w_{GS}$, corresponding to a specific form of the Standard ST can be written as,

$$w_{GS}(\tau - t, f, \gamma_{GS}) = \frac{|f|}{\sqrt{2\pi}\,\gamma_{GS}}\, e^{-\frac{f^{2}(\tau - t)^{2}}{2\gamma_{GS}^{2}}}$$

The Standard ST is obtained by setting $\gamma_{GS} = 1$. To improve the front time resolution of $w_{GS}$, a very low value of $\gamma_{GS}$ is required, which inherently degrades the frequency resolution and results in a trivial time-frequency spectrum. As an alternative, a bi-GW function, $w_{BG}$, is introduced for the visual identification of transition segments using two half GWs with different front and back tapers,

$$w_{BG}(\tau - t, f) = \frac{|f|}{\sqrt{2\pi}}\, \frac{2}{\gamma_{BG}^{F} + \gamma_{BG}^{B}}\, e^{-\frac{f^{2}(\tau - t)^{2}}{2[\gamma_{BG}(\tau - t)]^{2}}}$$

where the rate of taper of $w_{BG}$ as a function of (τ − t), i.e., $\gamma_{BG}(\tau - t)$, takes the value $\gamma_{BG}^{F}$ on the front side of the window and $\gamma_{BG}^{B}$ on the back side. The time resolution is controlled by decreasing the front taper and the frequency resolution is improved by increasing the back taper, resulting in an overall improved time-frequency resolution.

In [51], a hybrid approach based on ST and dynamics (Dyn) is proposed, in which the location of the signal components is first identified by Dyn, followed by the fast Fourier transform (FFT), and the inverse FFT is applied to only a few frequency components. Two GWs, G1 (a < 1) and G2 (a > 1), where a is the adjustable parameter for tuning the GW, were proposed in that paper to provide better flexibility and adaptation, the first for low-frequency components (f < 350 Hz) and the second for high-frequency components (f > 350 Hz).

An adaptive Dolph-Chebyshev window (DCW) was proposed in [52] in place of the GW for time-frequency analysis of non-stationary signals such as multi-component signals and frequency-modulated signals.
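The original DCW is defined in [52] in terms of the parameter

$$k_{0} = \cosh\!\left(\frac{1}{N - 1}\cosh^{-1}\frac{1}{r}\right)$$

and the n-th order Chebyshev polynomial $T_{n}(k)$, which is given by,

$$T_{n}(k) = \begin{cases} \cos\left(n \cos^{-1} k\right), & |k| \le 1 \\ \cosh\left(n \cosh^{-1} k\right), & k > 1 \\ (-1)^{n} \cosh\left(n \cosh^{-1}(-k)\right), & k < -1 \end{cases}$$

The DCW is tuned through a quantity q that depends on four parameters: β is the controlling parameter that sets the balance between ST and STFT, z defines the rate of change of the DCW width with frequency f, η is the DCW factor, and o defines the mode of change of the DCW width. The adaptive DCW is obtained by fixing β = 0 and varying the other parameters of q, i.e., z, o, and η, to tune the DCW, which enhances the energy concentration in the time-frequency plane.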
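Two of the window ingredients discussed in this subsection can be checked numerically with a short sketch: the unit-area normalization of a bi-Gaussian window built from two half Gaussians, and the Chebyshev polynomial together with the DCW parameter k0. The function names, the choice of which sign of τ − t counts as the front side, and the reading of r as the side-lobe-to-main-lobe amplitude ratio of the standard Dolph-Chebyshev design are assumptions made here, not details confirmed by [50] or [52].

```python
import math


def gaussian_window(dt, f, gamma=1.0):
    """Generalized Gaussian window in dt = tau - t; gamma = 1 is the Standard ST window."""
    return abs(f) / (math.sqrt(2 * math.pi) * gamma) * math.exp(-(f * dt) ** 2 / (2 * gamma ** 2))


def bi_gaussian_window(dt, f, gamma_front, gamma_back):
    """Two half Gaussians joined at dt = 0 and scaled to unit area.

    Assumption of this sketch: dt < 0 is treated as the front side.
    """
    gamma = gamma_front if dt < 0 else gamma_back
    scale = 2.0 / (gamma_front + gamma_back)
    return abs(f) / math.sqrt(2 * math.pi) * scale * math.exp(-(f * dt) ** 2 / (2 * gamma ** 2))


def area(window, span=1.0, steps=20001):
    """Trapezoidal integral of window(dt) over [-span, span]."""
    h = 2.0 * span / (steps - 1)
    vals = [window(-span + i * h) for i in range(steps)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


def chebyshev(n, k):
    """Chebyshev polynomial of the first kind, T_n(k), valid for any real k."""
    if abs(k) <= 1.0:
        return math.cos(n * math.acos(k))
    if k > 1.0:
        return math.cosh(n * math.acosh(k))
    return (-1) ** n * math.cosh(n * math.acosh(-k))


def dolph_k0(length, r):
    """k0 = cosh(acosh(1/r) / (N - 1)) for a window of N = length samples."""
    return math.cosh(math.acosh(1.0 / r) / (length - 1))
```

With these definitions, `area(lambda dt: bi_gaussian_window(dt, 50.0, 0.5, 2.0))` stays close to 1 even though the two tapers differ, and `chebyshev(N - 1, dolph_k0(N, r))` returns 1/r, which is the property that fixes k0 in the Dolph-Chebyshev design.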
A double-resolution ST (DRST) was proposed in [23] to reduce the computational complexity without losing the necessary information present in the signal to be analyzed. By ignoring the unnecessary frequency components present in the signal, this approach provides accurate frequency extraction because of its variable time-bandwidth product (unlike in the Standard ST). In the window functions defined for DRST, f and $f_{0}$ are the signal and fundamental frequency, respectively, and $\beta_{1}$ and $\beta_{2}$ are adjustable parameters that provide a variable time-bandwidth product. Setting $\beta_{1,2} = |f|$ reduces DRST to the Standard ST. Moreover, $\beta_{1,2} > |f|$ is desirable when the signal contains only the main frequency component, as in pure voltage sag or swell, to provide better time resolution. On the other hand, $\beta_{1,2} < |f|$ should be adopted to control the frequency resolution for signals such as transients and harmonics, which contain several frequency components. Because complex signals occur in practice, the frequency spectrum is generally divided into a low-frequency part and a high-frequency part, and different values of $\beta_{1}$ and $\beta_{2}$ are used to achieve double resolution.
In [53], the authors presented a novel optimally concentrated discrete window (OCDW) with a new scaling criterion. It is formulated as a constrained optimization problem whose objective function maximizes the product of energy; in the scaling rule, σ defines the values of M and L at the lowest (n = 1) and highest (n = N/2) frequencies. The authors in [49] proposed an optimized ST for the detection of complex PQ disturbances, thereby improving the time-frequency resolution. That paper overcomes the limitation of [23], in which DRST fails for mixed or complex PQ disturbances due to fixed $\beta_{1,2}$. The algorithm in [49] adjusts $\beta_{1,2}$ dynamically by optimizing the energy concentration, which is a function of $\beta_{1}$ and $\beta_{2}$. The optimized ST is found to concentrate the energy better than DRST even in the case of non-linearly mixed complex PQ disturbances.

A digital prolate spheroidal window (DPSW)-based modified ST [54] has been proposed for the accurate detection of voltage sag characteristics such as duration, depth, and phase angle jump. The zero-order discrete prolate spheroidal sequence is obtained as the solution of an eigenvalue equation, (30), in which $w^{i}_{n}(N, \chi)$ passes through a low-pass filter h(m), where N is the window length and i, n = 0, 1, 2, ..., N − 1. χ denotes the required main-lobe normalized frequency, from 0 to 1/2, and $\lambda_{i}(N, \chi)$ indicates the energy ratio defined for each eigenvector of (30). The eigenvector $w^{0}_{n}(N, \chi)$, which corresponds to the highest eigenvalue $\lambda_{0}(N, \chi)$ among all the eigenvectors, is chosen as the DPSW because it gives the highest energy aggregation. By setting $\chi = 173.0512\, e^{-11/f}$, better accuracy is achieved for the problem stated above.
A Kaiser window with a designed control function is proposed in [55] for ST. It provides better frequency resolution for disturbances such as harmonics and transients, and better time resolution at the fundamental frequency for checking the amplitude of disturbances such as sag and swell. Here, the attributes of the fundamental optimal energy concentration are chosen as the kernel function, which depends on the detection demand. This Kaiser ST is made up of the STFT and the Standard ST. In the mathematical expression of the Kaiser window function, $I_{0}$ is the zero-order modified Bessel function of the first kind and the shape parameter α is a function of f.
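For orientation, the textbook discrete Kaiser window can be sketched as follows. This is the standard form w[n] = I0(α·sqrt(1 − (2n/(N − 1) − 1)²)) / I0(α), not necessarily the exact expression or the control function α(f) designed in [55]; the function names and the power-series evaluation of I0 are choices made here so that the example needs no external library.

```python
import math


def bessel_i0(x, tol=1e-16):
    """Zero-order modified Bessel function of the first kind via its power series."""
    term = 1.0
    total = 1.0
    k = 1
    while term > tol * total:
        term *= (x / 2.0) ** 2 / (k * k)  # ratio of consecutive series terms
        total += term
        k += 1
    return total


def kaiser_window(length, alpha):
    """Textbook Kaiser window with `length` samples and shape parameter alpha."""
    denom = bessel_i0(alpha)
    window = []
    for n in range(length):
        ratio = 2.0 * n / (length - 1) - 1.0  # runs from -1 to 1 across the window
        arg = alpha * math.sqrt(max(0.0, 1.0 - ratio * ratio))
        window.append(bessel_i0(arg) / denom)
    return window
```

A larger α narrows the window in time (better time resolution) and a smaller α widens it (better frequency resolution), which is the knob a frequency-dependent control function would turn; α = 0 gives a rectangular window.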

B. BASED ON NUMBER OF GW PARAMETERS AND THEIR ADJUSTMENTS
The second category of modified STs concerns the number of GW parameters used and how they are adjusted in the literature. In [24], [56], the spread of the GW (σ) is modified by introducing one parameter (α); improved frequency resolution and improved time resolution are obtained with α > 1 and α < 1, respectively. Further, a window width optimized ST (WWOST) was presented in [25], in which two optimization algorithms, for constant and time-varying window width respectively, were proposed. The former can deal with low or slowly varying frequency components, while the latter is designed for high-frequency components. To improve the energy concentration, the standard deviation (σ) is modified with a newly introduced parameter p. With this parameter, the window becomes wider or narrower in the time-frequency domain for p > 1 (corresponding to the first optimization algorithm, for fixed window width) and p < 1 (corresponding to the second optimization algorithm, for varying window width), respectively, whereas p = 1 gives the Standard ST. A modified GW was proposed in [57] for the power signal clustering problem using a fuzzy C-means particle swarm optimization algorithm. Two positive scaling factors (a and b) were introduced to provide better control of time and frequency resolution, where $k \le \sqrt{a^{2} + b^{2}}$. A cross-spectral modified ST approach was proposed in [26] with a scaling factor (γ) that is defined to vary linearly with frequency for better progressive control of the GW,

$$\gamma(f) = m f + k$$

where k is the intercept and m is the slope of the linear variation with frequency. This approach is defined for phase synchrony and coherence analysis. In [58], the authors proposed a modified frequency scaling scheme for a fast adaptive discrete generalized ST. Owing to its linear frequency scaling scheme, the Standard ST computes the time-varying spectral characteristics at all frequencies, including irrelevant ones.
The method in [58] follows selective frequency scaling and window cropping schemes to include only the significant frequency components. Additionally, the window function is folded at the cropped points several times to reduce the effect of aliasing due to the discretization of samples. The spread of the GW depends on three newly introduced parameters (α, β, γ), where $r \le \alpha^{2} + \beta^{2}$ is the window width factor, γ controls the rate of change of the window width, α denotes the tradeoff between ST and STFT, and β defines the mode of change of the window width. The methodology has been applied to estimate PQ indices accurately.
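The effect of a frequency-dependent scaling factor such as the linear rule of [26] can be illustrated with a few lines of code. The sketch assumes that the scaling factor enters the GW spread as σ(f) = γ(f)/|f| with γ(f) = m·f + k, so that γ = 1 recovers the Standard ST; the function names and the values of m and k are hypothetical and chosen only to show the trend.

```python
def standard_sigma(f):
    """Standard ST: GW standard deviation sigma = 1 / |f|."""
    return 1.0 / abs(f)


def linear_scaled_sigma(f, m, k):
    """Assumed form sigma = gamma(f) / |f| with the linear rule gamma(f) = m*|f| + k."""
    gamma = m * abs(f) + k
    return gamma / abs(f)


# Hypothetical slope and intercept; m = 0, k = 1 would reproduce the Standard ST.
for freq in (50.0, 150.0, 350.0, 1000.0):
    print(freq, standard_sigma(freq), linear_scaled_sigma(freq, m=0.002, k=1.0))
```

With a positive slope, the window at high frequencies is wider than in the Standard ST, which trades some time resolution for frequency resolution exactly where the Standard ST localizes frequency poorly.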
A fast discrete ST (FDST) was introduced in [59] for multiple power quality disturbances with a modified GW having four parameters, where r and c denote the scaling factors that control the oscillations, while a and c are positive parameters. By varying c from 0 to 1, some damped hidden frequencies can be captured. Increasing the parameter r gives a broader window in the time domain and thus improves the frequency resolution. Two novel frequency scaling/partitioning schemes along with bandpass filtering are introduced to reduce the computational cost and thus provide a higher speed of convergence of the algorithm. To overcome the disadvantage of dyadic scaling, in which some significant frequency components can be missed, automatic frequency scaling and power signal analysis scaling are proposed to reduce the computation. A case study presented in that paper shows that FDST with automatic scaling has the lowest computational time and gives the best results. After that, bandpass filtering (window cropping) is applied, in which the cropped GW is multiplied with the FT of the disturbance signal, resulting in fewer computations. The cropped window width is selected according to the analyzed frequency components to satisfy the uncertainty principle. A sigmoid modified ST was proposed in [60] to control the GW width. By tuning some parameters of the sigmoid function, GWs with different widths are obtained for several frequency components. This methodology was applied to vibration monitoring of water pipes, and it gives better results than the previous linear modified ST [26] and power modified ST [28].
Here, the GW width is controlled by the sigmoid function, where β denotes the tuning parameters that control the width of the sigmoid function f(a, b), and S(f) is a function of two tuning parameters, a (for amplitude control) and b (for shape control), and of the maximum analysis frequency $f_{m}$. erf(χ) is the Gauss error function, which is defined as,

$$\mathrm{erf}(\chi) = \frac{2}{\sqrt{\pi}} \int_{0}^{\chi} e^{-t^{2}}\, dt$$

A novel hybrid GW was proposed in [27] to overcome the drawbacks of the previous versions of ST. Here, the parameter $(f^{r}/mf^{p} + k)$ denotes the number of frequency cycles within one σ of the GW. All four parameters, r, k, p, and m, provide more adaptability and flexibility to control the GW width. To optimize the ECM, a constrained optimization problem with non-linear inequality constraints was proposed in that paper. The GA approach was used to tune all four parameters and thus increase the time-frequency resolution even in the presence of additive Gaussian noise.
In [28], a modified optimal FDST was proposed for the detection of single and multiple PQ disturbances. Like [27], this paper also deals with maximizing the ECM as the objective function of an optimization problem, with a signal-dependent window instead of a frequency-dependent window. As a result, a sharper energy concentration in the time-frequency distribution is achieved. The standard deviation of the proposed GW has four parameters, where a changes the mode of the GW width, b denotes the window factor, c decides the tradeoff between STFT and ST, and d corresponds to the rate of change of the GW width. The ECM is then evaluated on this modified optimal FDST.

A performance comparison of all the major modified versions of ST is given in Table 1, illustrating the key advantages and limitations of each technique. For a given application, a version of ST is required that offers more energy concentration, less computational complexity, and high time-frequency resolution. To this end, a newer version is proposed in this manuscript which improves the accuracy of the ST version proposed in [28], i.e., the modified optimal FDST. In [28], the values of a, b, c, and d are tuned by maximizing the ECM over the whole ST matrix. In contrast, the proposed methodology maximizes the ECM of a normalized form of the proposed ST by considering only the one row of the ST matrix that corresponds to the fundamental frequency component ($f_{n}$), i.e., 50 Hz. This optimization problem consists of the proposed ECM with two inequality constraints and a boundary condition that the four parameters should lie between 0 and 2. The GW width places constraints on this optimization problem to maintain a suitable tradeoff between time and frequency resolution. Here, $f_{max}$ is decided based on the analyzed signal.
The sampling time is denoted by $T_{s}$, and the value of n is chosen as 3 to provide the minimum time resolution. Although the standard deviation of the GW in the proposed ST is the same as in Equation (42), the method for maximizing the ECM is different, which gives a significant reduction in computational time and complexity compared with the method described in [28].
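The idea of tuning the window on a single ST row can be sketched as follows. Several simplifications are assumed here and are not taken from the manuscript: the window has one tuning parameter `alpha` with σ = alpha / f0 instead of the four parameters a, b, c, d; the concentration measure is the commonly used form 1 / Σ|S̄| with the row S̄ normalized to unit energy; the inequality constraints are omitted; and a plain search over candidate values replaces the GA.

```python
import cmath
import math


def st_row(x, fs, f0, alpha):
    """One ST row at frequency f0, using a Gaussian window with sigma = alpha / f0."""
    sigma = alpha / f0
    n_samples = len(x)
    row = []
    for j in range(n_samples):
        acc = 0j
        for k in range(n_samples):
            dt = (j - k) / fs
            w = math.exp(-dt * dt / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))
            acc += x[k] * w * cmath.exp(-2j * math.pi * f0 * k / fs)
        row.append(acc / fs)  # 1/fs approximates the integration step dt
    return row


def ecm(row):
    """Assumed concentration measure: 1 / sum(|S|) after scaling the row to unit energy."""
    energy = math.sqrt(sum(abs(v) ** 2 for v in row))
    return energy / sum(abs(v) for v in row)


def tune_alpha(x, fs, f0, candidates):
    """Pick the candidate alpha that maximizes the ECM of the f0 row only."""
    return max(candidates, key=lambda a: ecm(st_row(x, fs, f0, a)))
```

Each candidate costs one row of N² operations instead of a full ST matrix, which is where the saving in computational time comes from when only the fundamental-frequency row is optimized.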

V. CASE STUDY
The accuracy of the Standard ST (oldest) [18], the modified ST (newest) [28], and the proposed ST in phase angle jump (PAJ) estimation has been investigated for a voltage sag signal. The PAJ is the shift in the voltage zero crossings, which helps to determine the cause of a PQD [48]. It is the largest value of the phase angle excluding the transition segments [61]. Complete details of the PAJ and the different techniques for its estimation are presented in [48], [61]. Initially, a synthetic voltage sag signal is generated in MATLAB as per the international standards IEEE 1159 [62] and IEC 61000-4-30 [63], with a sampling frequency of 3.2 kHz and an aggregation period of 0.2 s. An original PAJ of 30° is imposed, and white Gaussian noise at signal-to-noise ratios (SNR) of 30 dB and 20 dB is added to make the signal resemble a real-time PQD. Setting the values of a, b, c, and d in the GW of [28] to 1, 0, 0, and 1, respectively, yields the Standard ST, which makes the window dependent on frequency only.
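A test signal of this kind can be sketched in a few lines. The sampling frequency (3.2 kHz), record length (0.2 s), PAJ (30°), and SNR levels come from the description above; the sag depth, the start and end instants of the sag, the 50 Hz sine model, the sign convention of the phase jump, and the function names are assumptions made here for illustration, and Python is used in place of MATLAB.

```python
import math
import random


def voltage_sag(fs=3200.0, duration=0.2, f0=50.0, depth=0.5,
                t_start=0.06, t_end=0.14, paj_deg=30.0):
    """Per-unit sine wave with a sag of the given depth and a phase angle jump.

    depth, t_start, and t_end are hypothetical values, not taken from the case study.
    """
    n_samples = round(fs * duration)
    signal = []
    for k in range(n_samples):
        t = k / fs
        in_sag = t_start <= t < t_end
        amplitude = (1.0 - depth) if in_sag else 1.0
        phase = math.radians(paj_deg) if in_sag else 0.0
        signal.append(amplitude * math.sin(2 * math.pi * f0 * t + phase))
    return signal


def add_awgn(signal, snr_db, seed=0):
    """Add white Gaussian noise so that the signal-to-noise ratio is about snr_db."""
    rng = random.Random(seed)
    power = sum(v * v for v in signal) / len(signal)
    sigma = math.sqrt(power / 10 ** (snr_db / 10))
    return [v + rng.gauss(0.0, sigma) for v in signal]
```

With the default arguments the record has 640 samples, which matches the 640 × 640 ST matrix mentioned for the modified ST; calling `add_awgn(voltage_sag(), 30.0)` and `add_awgn(voltage_sag(), 20.0)` gives the two noise levels.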
The modified ST [28] has been implemented in MATLAB using GA which maximizes the ECM of the whole ST matrix of dimension 640 × 640. Further, the voltage sag signal analysis has been done using the proposed ST in which ECM function of Equation (45)
The adaptive ST has been utilized in [69] for the recognition of microsatellites in DNA. In [70], the authors proposed a modified window ST for the accurate identification of electromyograms, in which a genetic algorithm is used to optimize the window parameters. Further, ST is used to localize the hotspots in tubulin, which provides new insights into developing new anti-cancer drugs [71]. In [72], a two-dimensional discrete orthogonal ST has been employed for feature extraction from brain magnetic resonance imaging (MRI).
In the mechanical engineering field, ST was used in [73] for the early detection of vibrational signals from a gearbox to protect the mechanical system from failure. Further, the performance of ST was compared with the selective regional correlation technique in [74] for the diagnosis of faults in machine tools. In the geoscience field, ST is applied to calculate the localized spectrum of seismic cross-sections for color display [75]. Further, ST was utilized in [76] to calculate the P-wave arrival time in noisy seismic data. In [77], synthetic aperture radar image despeckling was performed with a 2D ST shrinkage technique. An amplitude-preserving ST was proposed in [78] for the compensation of seismic data attenuation. A novel synchrosqueezing ST was suggested in [79] for the decomposition of seismic data. The energy concentration in the time-frequency plane is further improved in [80] by adopting a synchrosqueezing generalized ST for seismic applications. A three-parameter ST was proposed in [81] for seismic data analysis. Moreover, a modified ST with an asymmetrical Kaiser window was implemented in [82] on seismic signals for effective detection of event onsets. In [83], the ST tool is used for accurate scanning of PP and PS waves. The synchrosqueezing generalized ST was also recently used in [84] for seismic time-frequency analysis with matching demodulation as a preprocessing step. In [85], a novel multisynchrosqueezing generalized ST was implemented for the accurate recognition of a tight sandstone gas reservoir.
Several state-of-the-art studies apply ST to fault identification. Among these, a hyperbolic ST was implemented in [86] for non-intrusive fault monitoring in a wide area measurement system. Further, the authors in [87] utilized ST to detect the location of a partial discharge source (PDS) with the help of signals captured using an optical sensor. In [88], faults in the stator winding of an induction motor were identified with the help of ST and random forest by sensing the stator current signals. In [89], ST estimates statistical parameters such as total harmonic distortion and classifies the faults in a grid-integrated wind energy system. In [90], a novel fast discrete orthogonal ST was proposed for micro phasor measurements, which further detects faults and islanding. A local demagnetization fault recognition system using ST was proposed in [91] for a permanent magnet linear synchronous motor. The authors in [92] implemented ST for the detection of broken bar faults, which gives information about the fault severity even during a very short starting period and under noisy conditions. In [93], a novel ST with adaptive adjustment was proposed for a VSC-based DC power system network to protect the DC grid against short-circuit faults. Further, ST was utilized in [94] for the protection of a distribution feeder. A hyperbolic ST is implemented in [95] for transformer differential protection against cross-country faults, which are faults that occur at two different locations within the same circuit. Moreover, a power calculation approach for non-stationary signals was presented in [96] using ST and the currents' physical components power theory.
In [97], the faults associated with direct lightning strikes were identified with the help of ST and the Mahalanobis distance. A hybrid combination of ST and affinity propagation clustering is used in [98] to separate two PDSs in oil-paper insulation. The authors in [99] implemented a hyperbolic window ST to estimate the contamination level of an overhead insulator by analyzing surface leakage current signals in the time-frequency domain.

VII. CONCLUSION
The principal aim of this paper is to survey the major reported literature on several modified variants of ST, illustrating their characteristics, for the accurate recognition of PQD signals. To this end, these variants are classified based on the window used and on the number of parameters along with their tuning by an optimization technique. The purpose of all these variants is to optimize the ECM in the time-frequency plane and to reduce the computational time. A case study is presented to demonstrate the highest accuracy of the proposed algorithm in PAJ estimation over the other variants of ST, which makes the proposed method suitable for other PQD signals such as swell, transient, and interruption, along with complex signals. This paper also indicates several other diversified application areas of ST, such as atmospheric physics, cardiovascular time series analysis, seismography, and biomedical science. There is still considerable scope for research in this field, as time and frequency resolution can only be improved simultaneously up to the limit set by Heisenberg's uncertainty principle. Before mitigating a complex PQD, a precise assessment of its type as well as its underlying cause is needed, and this accuracy depends on the PQ indices extracted from the transformed curves, which are then fed to the classifiers. Thus, the proposed study may give direction for developing new ST-based methodologies that meet all the key requirements, such as accurate statistical values of PQ indicators, feasibility of online as well as offline implementation, noise immunity, and classification of complex PQD.