Partial Discharge Signal Denoising Based on Wavelet Pair and Block Thresholding

In electrical engineering, it is of great importance to monitor partial discharge (PD) of high voltage apparatus. However, on-site PD signal easily gets corrupted by white noise, and hence denoising the measured PD signal is necessary to acquire pure PD. In traditional method, single wavelet and regular thresholding are utilized. By combining wavelets of the same family, dual-tree complex wavelet pairs can be constructed. Specific combinations are approximately translation invariant, and they are also able to achieve better denoising effect without sophisticated computation. Meanwhile, considering the correlation between adjacent wavelet coefficients, the block thresholding is employed. In this paper, it is proposed to combine wavelet pair and block thresholding for PD denoising. There are four combinations based on single wavelet/wavelet pair and regular thresholding/block thresholding, and they are compared with each other. Based on a numerical study, the results demonstrate that the proposed algorithm outperforms the traditional method.


I. INTRODUCTION
In power system, partial discharge (PD) is one of the most common factors damaging the insulation of high voltage apparatus. Its long-term existence will deteriorate their insulation [1]- [3], which may lead to serious accidents such as a large-scale power outage. In order to inspect the insulation status, online PD monitoring is necessary for critical apparatus such as transformer and gas insulated switchgear (GIS). On-site PD signal tends to be weak and easily gets corrupted by white noise. To obtain pure PD signal, denoising is needed for measured data. There are mainly two denoising strategies, i.e. singular value decomposition (SVD) denoising method and wavelet thresholding denoising method.
In SVD denoising method, a Hankel matrix is firstly constructed from the noisy PD signal, and SVD is applied to the matrix in order to remove the smaller singular values for denoising [4]- [6]. However, this method may consume too much time.
The original wavelet thresholding denoising method is proposed by Donoho [7]- [9]. In order to improve Donoho's method, some researchers afterwards focused on mother wavelet selection [10]- [12]. This is due to the fact that The associate editor coordinating the review of this manuscript and approving it for publication was Zhixiong Peter Li . different mother wavelets have different denoising performance, so an optimal wavelet needs to be selected. Also, some researchers tried to modify the thresholding value and function. In [13] and [14], second order continuous differentiable thresholding function is put forward, and an optimal thresholding value is to be obtained by gradient descent method. Particle swarm optimization (PSO) is used in [14] to search for wavelet thresholding value. These improved versions can perform better than the original method.
However, there is an inherent disadvantage in Donoho's method as well as its improved versions, for it is endowed with down-sampling process, which deprives its translation invariance and affects the accuracy of denoising results. To avoid this disadvantage, Coifman proposed a cycle spinning method, which shifts the measured signal pointwise and obtains its circular versions, then denoise and shift back each of them, finally the average is output as the result [15]. The method can obtain more accurate results than Donoho's method, but the computation cost is much greater.
In order to avoid great computation cost and ensure the translation invariance, it is proposed to select two independent real wavelets ψ r and ψ i , which constitute the real and imaginary part of a dual-tree complex wavelet, i.e. ψ c = ψ r + iψ i , where i 2 = −1 [16]. It is also pointed out that for a translation invariant complex wavelet, the scale filter h r of its real part and the scale filter h i of its imaginary part should satisfy 1/2 sample point delay condition [17], [18], i.e. h i (n) = h r (n−0.5). The translation invariant pair (ψ r , ψ i ) is also known as the Hilbert pair. However, the quadrature mirror filters (QMF) of orthogonal real wavelets are all finite impulse response filters (FIR), so this condition is unreachable and can only be approximate.
To keep approximate translation invariance as possible, Kingsbury and Selesnick et al. [16], [19], [20] pointed out that it is necessary to construct a dual-tree complex wavelet and use two individual wavelets to denoise the signal. However, sophisticated iterative computation is still needed for QMF construction. Different from Selesnick, Tay proposed another idea [21]- [23]. In Tay's method, two Db wavelets whose vanishing moments differ by two are selected to construct a dual-tree complex wavelet, which will be approximately translation invariant. In fact, wavelets of the same family can also be used for the construction.
In regular thresholding functions, each individual wavelet coefficient is thresholded independently, and the correlation between adjacent coefficients is not taken into consideration. To make full use of potential information between wavelet coefficients, Cai [24], [25] proposed a block thresholding method instead. In Cai's method, wavelet coefficients are divided into non-overlapping blocks. Each of the blocks is then regarded as a whole part and thresholded independently according to certain rules.
In order to achieve better PD signal denoising results, it is proposed to combine wavelet pair and block thresholding in this paper. A numerical computation is carried out and the results verify that the proposed algorithm outperforms the traditional method with better accuracy.
The rest of the paper is organized as follows: Section II describes traditional wavelet thresholding denoising method and its improvement. In Section III, the proposed improvements and the algorithm are described. Three wavelet families are introduced, and the best combinations are computed. Block thresholding is also compared with regular thresholding. A numerical case study is proposed in Section IV, and a conclusion is drawn in Section V.

II. WAVELET THRESHOLDING DENOISING METHODOLOGY
The observation model of PD signal can be expressed as: where f (n) is the measured noisy PD signal with length N , s(n) is the pure PD signal, w(n) is the standard normal Gaussian white noise with mean value 0 and standard deviation value 1, and σ is the standard deviation of the noise. The most commonly used wavelet thresholding in PD signal denoising is proposed by Donoho [26], [27]. The method consists of wavelet transform and thresholding, and it is simple to implement.

A. PRINCIPLES OF DONOHO'S METHOD
The detailed steps of Donoho's method are shown as follows: Firstly, QMF, i.e. scale filter h and wavelet filter g are utilized to decompose the signal of length N into K levels.
The approximation coefficients a j [p] and detail coefficients d j [p] of each level are obtained, where j is the level and p is the coefficient number. The decomposition procedure shown in (2) is also called the down-sampling procedure.
Secondly, standard deviation σ is estimated with (3), where MED is the median function: Then the thresholding value λ is to be set, and the thresholding function is to be selected. For example, the well known fixed thresholding value λ =σ √ 2 ln N , and soft thresholding function ρ S shown in (4). The detail coefficients d j [p] are processed by it.
Finally, inverse transform is applied, and the denoised PD signal is obtained. The reconstruction procedure is shown in (5), which is also called up-sampling procedure.
To improve the results of PD signal denoising, researchers mainly focused on mother wavelet selection. The idea is to preset some certain criteria and build a wavelet library for retrieval of the optimal candidate. A typical example is presented in [10], the authors proposed an energy based wavelet selection (EBWS) method, which defines an energy parameter E a as shown in (6), and the candidate in the library with the largest E a will be selected as the mother wavelet. In (6), J is the decomposition level, a is the approximation coefficients and d is the detail coefficients.
Similarly, a signal-to-noise ratio based wavelet selection (SNRBWS) method is proposed in [11], and a sub-band VOLUME 8, 2020 energy to entropy based wavelet selection (SBETEBWS) method is proposed in [12]. Energy to entropy ratio, sample entropy, and sparsity index are also utilized as criteria in optimal mother wavelet selection [28]- [30].
In fact, these methods are retrieval methods which only search for the optimal wavelet from existing ones, thus showing no essential difference from Donoho's method.

C. MODIFICATION OF THRESHOLDING VALUE AND FUNCTION
Another idea for improvement is to modify the thresholding value and function. One typical example is the thresholding function with second order continuous derivative proposed in [13] as shown in (7), where k is a non-negative integer, and the thresholding value t is to be determined.
By minimizing the risk of Stein's unbiased risk estimation (SURE), the optimal thresholding value t can be obtained. The modified method is named as adaptive thresholding method, and it is able to improve the results of PD signal denoising.
However, there is down-sampling procedure in (2) for Donoho's method and its improved versions, so they are not translation invariant. Therefore, noisy wavelet coefficients may introduce errors to reconstructed signal [31]. In order to overcome this disadvantage of single wavelets, a dual-tree complex wavelet construction is necessary. Also, the correlation between adjacent wavelet coefficients is not taken into account by the regular thresholding functions.

III. PROPOSED IMPROVEMENTS AND ALGORITHM A. BEST PARTNER FOR SINGLE WAVELET
Inspired by Tay's idea, we decide to use wavelet pairs of the same families to construct a dual-tree complex wavelet instead of single wavelet for PD denoising. In fact, the QMF of wavelets belonging to the same family have similar magnitude and phase properties, so they can be used to construct wavelet pairs. However, not two arbitrary wavelets constitute an approximate Hilbert pair, and the best combinations should be computed and compared. In this paper, a wavelet library is firstly constructed, which includes three wavelet families, i.e. Db wavelets, symmlets, and coiflets. Db wavelets vary from Db1 to Db15, symmlets vary from sym2 to sym15, and coiflets vary from coif1 to coif5, so there are 34 wavelets totally.
The quality parameters E 1 and E 2 proposed in [23] are adopted in this paper to measure the approximation quality of the wavelet pairs. For a complex wavelet ψ c , its quality parameters are defined as: Smaller quality parameters stand for better approximate translation invariance. For any single Db wavelet, its best partner is the Db wavelet whose vanishing moment differs by two from its own [23]. For symmlets and coiflets, the best partner of a single wavelet is obtained by computing the quality parameters E 1 and E 2 , as shown in Table. 1.
Therefore, once a single wavelet is determined as the mother wavelet, its best partner can be found in Table. 1. Then, their QMF can be simply obtained by QMF look-up table, and no extra scale/wavelet filter construction process is needed.

B. BLOCK THRESHOLDING VERSUS REGULAR THRESHOLDING
For PD signal denoising, researchers mainly utilized soft thresholding function or hard thresholding function proposed by Donoho [32], [33]. The soft thresholding function is a kill or shrink rule, and hard thresholding is a kill or keep rule. For soft thresholding function, when the absolute value of a wavelet coefficient is large enough, it is shifted by an amount of λ towards zero, which may bring unnecessary bias for reconstructed signal. For hard thresholding function, its discontinuity may lead to large variance.
From another point of view, denoising can also be deemed as the procedure which multiplies each wavelet coefficient by a shrinkage factor, i.e. (4) can be rewritten as (9): where β is the shrinkage parameter larger than 0, and c j is the shrinkage factor. When wavelet coefficient is smaller than the predetermined threshold value λ, the corresponding shrinkage factor c j is set to 0, otherwise it is equal to a positive number smaller than 1. Soft thresholding is a specific case when β is equal to 1. Similarly, hard thresholding is also a specific case when β approaches positive infinity. β = 2 is another case known as James-Stein thresholding, which is able to remedy the drawbacks of soft and hard thresholding functions to some degree. It can be seen from (4) that regular thresholding functions process each wavelet coefficient independent of the adjacent coefficients. In fact, pure PD signal is smooth in both time and wavelet domain, whereas white noise still keep white in wavelet domain. Correlation between adjacent coefficients can be used to determine the threshold values. Block thresholding can increase the accuracy by pooling information about neighboring coefficients to make a thresholding decision [34]. Therefore, it is suggested in this paper that block thresholding should be employed.
In block thresholding method, block length L is firstly determined, and then wavelet coefficients on each level are divided into non-overlapping blocks [32]. Therefore, the m-th block on level j can be expressed as: by denoting S 2 j,m = j,k∈B j,m d 2 j,k , the wavelet shrinkage factor of B j,m is: whereσ is the estimated noise standard deviation. β, λ and L are all parameters to be determined. Obviously, their values will influence the denoising result, so optimization of them is needed. Based on James-Stein estimation and SURE rule, it is deduced that minimum denoising error of block thresholding method can be reached if and only if (12) holds: where N is the length of the detail coefficients on level j. Substitute (12) into (11) and shrinkage factor can be obtained.

C. THE PROPOSED ALGORITHM
In this paper, wavelet pair and block thresholding are combined, and the proposed denoising algorithm is stated as follows: Firstly, a single wavelet M is selected according to some certain rule such as EBWS, and the wavelet pair (wavelet M , wavelet N ) is selected to construct a dual-tree complex wavelet and decompose the measured noisy PD signal; Secondly, independent block thresholding is respectively applied to real and imaginary part of decomposed signal; Finally, inverse transform is applied to the real and imaginary part, and they are averaged to be the final result. The denoising algorithm schematic flow chart is shown in Fig. 1.

IV. CASE STUDY OF NOISY PD DENOISING
In this section, the mathematical models of four typical PD signals described in [35] are adopted, i.e. the ultra high frequency (UHF) PD signal in gas insulated switchgear (GIS) with sulfur hexafluoride (SF 6 ) as the medium. The PD signals are respectively denoted as G-type PD signal, M-type PD signal, N-type PD signal, and P-type PD signal for short.
For sake of simplicity, four normalized pure PD signals are merged into one PD signal. The waveform of it and the noisy version (σ = 0.5) are shown in Fig. 2, where the abscissa is the sampling point; the ordinate is the normalized amplitude A * , and the unit is mV.

A. COMPUTATION RESULTS OF THE PROPOSED ALGORITHM
The decomposition level is another topic in wavelet transform, and there are also relative researches [36], [37]. However, usually a reasonable decomposition level is enough. Therefore, the decomposition level is set to be 5 in this paper.
EBWS is utilized as the criterion to determine the optimal single wavelet, and the criterion values of each wavelet on the noisy PD signal is computed, as shown in Fig. 3, the abscissa is the wavelet number, and the ordinate is the EBWS value. Db1 to Db15 are numbered as 1 to 15, sym2 to sym15 are   Db10 wavelet is selected accordingly, and (Db10, Db8) is selected as the wavelet pair.
For block thresholding, the parameters are selected as mentioned in Section III. When the standard deviation σ is equal to 0.5, noisy signals shown in Fig. 2 are processed by the proposed algorithm in this paper.
The denoised signal is shown in Fig. 4. It can be seen that intuitively the proposed algorithm performs well to suppress the white noise in PD signal.  To quantitatively verify the performance of the proposed algorithm, two parameters are adopted for evaluation, i.e. signal to noise ratio (SNR) and normalized correlation coefficient (NCC). SNR is used to measure energy ratio of pure PD to noise, and it is defined by (13) (13) where s is the pure PD signal, f is the noisy signal or denoised signal.
NCC is a parameter to measure the linear correlation between two signals, and it can be used to measure the similarity between PD signal before and after denoising. It is defined by (14): where x is the pure PD signal, y is the noisy signal or denoised signal. x and y are their mean values. The larger SNR and NCC, the better denoising performance. The SNR and NCC before/after denoising are shown in Table. 2, it can be seen that for the PD signal, the SNR increases nearly by 15, and NCC increases nearly by 0.55.

B. COMPUTATION RESULTS COMPARISON BETWEEN DIFFERENT COMBINATIONS AND METHODS
In order to compare the performance of different combinations and methods, Gaussian white noise with different standard deviations are then added to pure PD signal. Set a large range of standard deviation, and noisy PD signals can be obtained with different values of SNR. The noisy PD signals are employed for computation. The SNR values of original noisy PD signals are set as {5,0, −5}, and the corresponding NCC values are also computed. Noisy PD signals with different SNR are respectively denoted as case1, case2, and case3.
Under each case, for the proposed algorithm there are four combinations to be discussed and compared, i.e. single wavelet/regular thresholding, single wavelet/block thresholding, wavelet pair /regular thresholding, and wavelet pair/block thresholding.
Meanwhile, the proposed algorithm is also compared with the SVD method and the adaptive thresholding method. In this case, the EBWS criterion values of each wavelet on the noisy PD signal is firstly computed, as shown in Fig. 5.
Db15 wavelet is selected accordingly, and (Db15, Db13) is selected as the wavelet pair. Heuristic SURE threshold values and soft thresholding function are utilized for regular thresholding function. For adaptive thresholding, the threshold values are selected as mentioned in Section II. For block thresholding, the parameters are selected as mentioned in Section III.
After computation, the SNR and NCC by different combinations and methods are shown in Table. 3.  The results show that among four combinations, the proposed wavelet pair/block thresholding performs best.
Also, wavelet pair/regular thresholding outperforms single wavelet/regular thresholding with larger SNR 0.75dB and NCC 0.001; single wavelet/block thresholding outperforms single wavelet/regular thresholding with larger SNR 0.54dB and NCC 0.001.
Meanwhile, the proposed wavelet pair/block thresholding also outperforms SVD and adaptive thresholding, with larger SNR 1.17dB and 0.48dB respectively, and larger NCC 0.004 and 0.0005 respectively.
2) CASE2:SNR = 0 In this case, the EBWS criterion values of each wavelet on the noisy PD signal is firstly computed, as shown in Fig. 6.
Sym7 wavelet is selected accordingly, and (sym7, sym9) is selected as the wavelet pair. Heuristic SURE threshold values and soft thresholding function are utilized for regular thresholding function. For adaptive thresholding, the threshold values are selected as mentioned in Section II. For block thresholding, the parameters are selected as mentioned in Section III.
After computation, the SNR and NCC by different combinations and methods are shown in Table. 4. The results show that among four combinations, the proposed wavelet pair/block thresholding performs best.
Also, wavelet pair/regular thresholding outperforms single wavelet/regular thresholding with larger SNR 0.54dB and NCC 0.002; single wavelet/block thresholding outperforms single wavelet/regular thresholding with larger SNR 0.24dB and NCC 0.002.
Meanwhile, the proposed wavelet pair/block thresholding also outperforms SVD and adaptive thresholding, with larger SNR 1dB and 0.52dB respectively, and larger NCC 0.0045 and 0.0019 respectively.

3) CASE3:SNR = −5
In this case, the EBWS criterion values of each wavelet on the noisy PD signal is firstly computed, as shown in Fig. 7.
Sym9 wavelet is selected accordingly, and (sym9, sym7) is selected as the wavelet pair. Heuristic SURE threshold values and soft thresholding function are utilized for regular thresholding function. For adaptive thresholding, the threshold values are selected as mentioned in Section II. For block thresholding, the parameters are selected as mentioned in Section III.
After computation, the SNR and NCC by different combinations and methods are shown in Table. 5. The results show that among four combinations, the proposed wavelet pair/block thresholding performs best.
Also, wavelet pair/regular thresholding outperforms single wavelet/regular thresholding with larger SNR 0.55dB and NCC 0.006; single wavelet/block thresholding outperforms single wavelet/regular thresholding with larger SNR 0.17dB and NCC 0.006.
Meanwhile, the proposed wavelet pair/block thresholding also outperforms SVD and adaptive thresholding, with larger SNR 1dB and 0.47dB respectively, and larger NCC 0.01 and 0.005 respectively.
From the computation results under three cases, it can be seen that all combinations perform well, but wavelet pair outperforms single wavelet for any combination. Also, block thresholding outperforms regular thresholding for any combination. That indicates the positive improvements to wavelet thresholding by wavelet pair and block thresholding.
Meanwhile, the proposed algorithm also outperforms the SVD method and the adaptive thresholding method.

C. COMPUTATION COMPLEXITY AND TIME COMPARISON
The complexity of all the denoising methods should also be addressed. For single wavelet/regular thresholding, its computing process involves a down-sampling procedure, a thresholding procedure, and a up-sampling procedure; for single wavelet/block thresholding, it involves an extra block dividing procedure, and the other procedure is the same as single wavelet/regular thresholding. For wavelet pair combinations, they consist of two independent single wavelets computing process.
For a PD signal of length N , and the wavelet pairs QMF of length L fa and L fb , the total operation of single wavelet /regular thresholding is 31N * L fa /2−15N /16+5; the total operation of single wavelet/block thresholding is 31N * L fa /2+91N /16+5−lnN ; the total operation of wavelet pair/regular thresholding is 31N * (L fa + L fb )/2−15N /8+10; and the total operation of wavelet pair/block thresholding is 31N * (L fa + L fb )/2+91N /8+10−2lnN . It can be seen that the complexity of all combinations are all O(N ). Therefore, the proposed algorithm has the same computation complexity as Donoho's method, and their computation time will not differ too much from each other.
In the SVD method, matrices with large size need to be decomposed, and in the adaptive thresholding method, optimal thresholding value searching needs iteration. Therefore, both methods are excessively time-consuming. The computation time of all four methods are shown in Table. 6. The computation is performed on a PC under Windows 10 64-bit with CPU 2.4 GHz, and MATLAB2016 64-bit software.
It can be seen that the computation time of the proposed algorithm is only a little larger than that of Donoho's method, and it is much smaller than that of SVD method and adaptive thresholding method. Therefore, the computation time difference between the proposed algorithm and Donoho's method can be ignored.

V. CONCLUSION
In this paper, wavelet pairs are utilized to form real and imaginary part of a dual-tree complex wavelet, and block thresholding is also used to denoise noisy UHF PD signal corrupted by white noise. Wavelets belonging to the same family are combined to construct dual-tree complex wavelet with approximate translation invariance, and the best partners of any single wavelet are also computed.
Based on the calculation results from the numerical study, it can be concluded that different combinations have different performance, wavelet pair outperforms single wavelet, even though the thresholding combination may be different; the block thresholding outperforms regular soft thresholding; the proposed algorithm combined by wavelet pairs and block thresholding has the best performance among all combinations. In addition, the proposed algorithm performs better than other existing denoising method.
Meanwhile, although the proposed algorithm may take more computation time than Donoho's method, their complexity is of the same order, and both of their computation time are much smaller than other existing denoising method. Therefore, the proposed algorithm can be considered for utilization in practical engineering.