An Improved Algorithm for Peak Detection Based on Weighted Continuous Wavelet Transform

Peak detection is a crucial preprocessing step in the analysis of various spectral signals. Methods based on the continuous wavelet transform are practical and popular, and offer better detection accuracy and reliability because they identify peaks across scales in the wavelet space and implicitly remove noise as well as the baseline. However, overlapping peaks are inevitable in measured spectra, and the composite ridges they form reduce peak detection accuracy. Most peak detection methods have limited applicability to overlapping peaks. A weighted continuous wavelet transform (WCWT) peak detection algorithm is proposed to improve the adaptive ability of the peak detection method. This method yields more obvious spectral peak characteristics in low-scale regions. Composite ridges can be successfully truncated by setting a noise threshold based on the standard deviation of the spectral signal. In addition, the maximum value on each ridge is compressed and shifted to a smaller scale, which allows the peaks to be determined more accurately. The method was applied to peak detection in simulated spectra, the Romanian database of Raman spectra, and real liquid electrode glow discharge spectra. The results show that the proposed method exhibits good peak detection performance.

The associate editor coordinating the review of this manuscript and approving it for publication was Norbert Herencsar.
These methods have positive significance for the detection of spectral peaks.
Peak detection methods based on the continuous wavelet transform (CWT) have received significant attention in recent years because they offer flexibility, multi-resolution analysis, and easy implementation. Du et al. [13] employed the CWT for peak detection, which improved the accuracy and reliability of detection by identifying cross-scale peaks in the wavelet space and implicitly removing noise and baselines. Zhang et al. [14] developed a multiscale peak detection method that makes full use of the information in the ridges, valleys, and zero-crossings of the CWT coefficient matrix to improve peak detection accuracy. Zheng et al. [15] presented an improved method that combined the CWT with a crazy climber algorithm to identify peaks from the positions of ridges and achieved better performance in overlapped peak detection. Liu et al. [19] modified the mother wavelet to shorten its linewidth and applied it to identify Raman spectral peaks with good results. Zheng et al. [10] explored a method that combined the CWT and curve fitting to reduce the influence of noise. However, curve fitting reduced the peak detection efficiency.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In the CWT peak detection algorithm, peaks are usually obtained by searching for ridges or valleys in the coefficient matrix, where the ridges are determined from the local maxima in the search window. However, in the CWT coefficient matrix, owing to the presence of overlapping peaks, the ridges merge to form composite ridges at larger scales. In addition, the ridge is not a straight line because of the influence of the adjacent peaks and noise on the spectra. The maximum value of the ridge, particularly that of the composite ridge with overlapping peaks, is typically located on a larger scale. The maximum value of the ridge on a larger scale cannot correctly determine the peak position, which affects peak detection accuracy.
Therefore, a peak detection method that can effectively identify overlapping peaks and reduce false peaks requires further investigation. In this study, an improved method that introduces a weighting function into the mother wavelet, that is, the weighted continuous wavelet transform (WCWT) peak detection method, is developed, and the selection of the wavelet function and scale parameters is described in detail. The noise threshold is defined by the standard deviation of the spectral signal, which can effectively truncate composite ridges. Moreover, the maximum value of the wavelet transform coefficient is compressed and shifted to a smaller scale to determine the peak position more accurately. In general, the improved peak detection method inherits the advantages of the continuous wavelet transform method and exhibits better peak detection performance.

A. CONTINUOUS WAVELET TRANSFORM
The wavelet transform was developed from the short-time Fourier transform and achieves localization in both the time and frequency domains simultaneously. Therefore, the CWT has the outstanding property of focusing on local details of the signal. The CWT is widely used in signal processing, for example in discontinuity and chirp signal detection [21], and allows the wavelet transform to be computed at every scale. The scales are chosen according to the level of detail required by the analysis [13]. In particular, the CWT makes the peak information in the spectrum more obvious through redundant continuous translation at each scale. Mathematically, the CWT is represented by Eq. (1):
$$C(a,b) = \int_{-\infty}^{+\infty} f(t)\, w_{a,b}(t)\, dt, \qquad w_{a,b}(t) = \frac{1}{\sqrt{a}}\, w\!\left(\frac{t-b}{a}\right) \tag{1}$$
where $f(t)$ is the signal, $a$ is the scale parameter, $b$ is the translation parameter, $w(t)$ is the wavelet mother function, $w_{a,b}(t)$ is the scaled and translated wavelet, and $C$ is the two-dimensional coefficient matrix that reflects the similarity between the wavelet functions and the signal. The CWT can be regarded as the convolution of the signal with the wavelet function at a given scale. The larger $C(a,b)$ is, the greater the similarity between the signal and the wavelet function $w_{a,b}(t)$. Therefore, the spectral peak positions can be estimated from the ridges formed in the wavelet space by the local maxima of the wavelet transform coefficients.
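The convolution view of the CWT described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used in this study: the function names `mexican_hat` and `cwt`, the truncation parameter `support`, and the test signal are assumptions introduced here.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat mother wavelet with unit L2 norm."""
    return 2.0 / (np.sqrt(3.0) * np.pi**0.25) * (1.0 - t**2) * np.exp(-t**2 / 2.0)

def cwt(signal, scales, wavelet=mexican_hat, support=8.0):
    """Discrete CWT: row i holds the coefficients C(a_i, b) for every shift b."""
    signal = np.asarray(signal, dtype=float)
    max_half = (signal.size - 1) // 2           # keep the kernel shorter than the signal
    coeffs = np.empty((len(scales), signal.size))
    for i, a in enumerate(scales):
        half = min(int(np.ceil(support * a)), max_half)
        t = np.arange(-half, half + 1)
        kernel = wavelet(t / a) / np.sqrt(a)    # scaled wavelet with the 1/sqrt(a) factor
        # The kernel is symmetric, so correlation and convolution coincide.
        coeffs[i] = np.convolve(signal, kernel, mode="same")
    return coeffs

x = np.arange(200)
peak = np.exp(-(x - 80.0) ** 2 / (2 * 4.0**2))  # hypothetical single Gaussian peak at 80
C = cwt(peak, scales=[2, 4, 8])
print(C.argmax(axis=1))                         # location of the largest coefficient per scale
```

For an isolated symmetric peak, the largest coefficient at every scale sits at the peak position, which is the property the ridge search relies on.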

B. WAVELET MOTHER FUNCTION AND WEIGHTING FUNCTION
The choice of the wavelet mother function mainly depends on its symmetry and waveform. The background information of the spectral lines can be effectively suppressed by selecting a symmetric wavelet mother function for the CWT. In addition, the shape of the wavelet mother function should be similar to the peak shape of the spectrum being analyzed. The Mexican hat wavelet is a mother wavelet commonly used in wavelet transforms for peak detection. It is proportional to the second derivative of the Gaussian probability density function and has the main characteristics of a spectral peak shape, including symmetry, a single maximum, and an approximately Gaussian shape. The mathematical representation of the Mexican hat wavelet is given by Eq. (2):
$$w(t) = \frac{2}{\sqrt{3}\,\pi^{1/4}} \left(1 - t^{2}\right) e^{-t^{2}/2} \tag{2}$$
The second derivative Gaussian wavelet (Gaus2 wavelet) is also derived from the Gaussian function and exhibits a mathematical form and peak characteristics similar to those of the Mexican hat wavelet. It can be defined by Eq. (3):
$$w(t) = C_{2} \left(1 - 2t^{2}\right) e^{-t^{2}} \tag{3}$$
where $C_{2}$ is a normalization constant. The resolution of the CWT is controlled by the scale parameter $a$ of the mother wavelet. As the scale increases, the resolution decreases. The wavelet transform at a small scale achieves better resolution of overlapping peaks because of the smaller half-width of the wavelet. The Gaus2 wavelet and the Mexican hat wavelet at the same scale are shown in Figure 1. The Gaus2 wavelet has a smaller half-width and therefore provides better resolution of overlapping peaks in the wavelet space.
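The difference in half-width between the two wavelets can be checked numerically. The sketch below assumes the standard unnormalized shapes $(1 - t^2)e^{-t^2/2}$ for the Mexican hat and $(1 - 2t^2)e^{-t^2}$ for Gaus2 (normalization does not affect the width); the helper `fwhm` and the sampling grid are illustrative choices.

```python
import numpy as np

def mexican_hat_shape(t):
    return (1.0 - t**2) * np.exp(-t**2 / 2.0)

def gaus2_shape(t):
    return (1.0 - 2.0 * t**2) * np.exp(-t**2)

def fwhm(shape, t):
    """Full width at half maximum of the central lobe, measured on the grid t."""
    y = shape(t)
    above = t[y >= 0.5 * y.max()]   # only the central lobe exceeds half the maximum
    return above.max() - above.min()

t = np.linspace(-5.0, 5.0, 200001)
ratio = fwhm(gaus2_shape, t) / fwhm(mexican_hat_shape, t)
print(round(ratio, 3))
```

Under these assumed forms, the Gaus2 shape equals the Mexican hat shape evaluated at $\sqrt{2}\,t$, so its central lobe is narrower by a factor of $1/\sqrt{2}$ at the same scale.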
The coefficient $1/\sqrt{a}$ in Eq. (1) ensures that the wavelet function $w_{a,b}(t)$ has the same energy as the mother function $w(t)$ at any scale $a$, which is an important property for many wavelet-based algorithms such as data compression. For peak detection algorithms, the constraint of energy conservation can be ignored, which allows wavelets with specific properties to be selected and tailored to the shape of the data peaks and the desired sensitivity for overlapping peak detection [16]. In other words, because the wavelet transform calculation can be customized for a particular type of data, a weighting function $g(a)$ can be introduced into the mother wavelet of the CWT. For the Gaus2 wavelet employed in this study, the modified wavelet is given by Eq. (4), with the weighting function $g(a) = 1/e^{a}$. As the scale parameter $a$ increases, the wavelet transform becomes weaker. This is necessary for the truncation of the composite ridges by setting a noise threshold.
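The effect of the weighting function can be illustrated with a small sketch. It assumes that $g(a) = e^{-a}$ simply replaces the $1/\sqrt{a}$ factor of Eq. (1); this is one reading of the text, and the exact form of Eq. (4) is not reproduced here. The names `gaus2` and `wcwt`, the `support` parameter, and the test peak are hypothetical.

```python
import numpy as np

def gaus2(t):
    """Second-derivative-of-Gaussian shape; the normalization constant is omitted."""
    return (1.0 - 2.0 * t**2) * np.exp(-t**2)

def wcwt(signal, scales, weight=lambda a: np.exp(-a), support=5.0):
    """Weighted CWT sketch: weight(a) replaces the usual 1/sqrt(a) factor (assumption)."""
    signal = np.asarray(signal, dtype=float)
    max_half = (signal.size - 1) // 2
    coeffs = np.empty((len(scales), signal.size))
    for i, a in enumerate(scales):
        half = min(int(np.ceil(support * a)), max_half)
        t = np.arange(-half, half + 1)
        kernel = weight(a) * gaus2(t / a)
        coeffs[i] = np.convolve(signal, kernel, mode="same")
    return coeffs

x = np.arange(300)
spectrum = np.exp(-(x - 150.0) ** 2 / (2 * 3.0**2))   # hypothetical peak, sigma = 3
scales = np.arange(1, 21)
weighted = wcwt(spectrum, scales)
unweighted = wcwt(spectrum, scales, weight=lambda a: a ** -0.5)
print("scale of ridge maximum, weighted:  ", scales[weighted[:, 150].argmax()])
print("scale of ridge maximum, unweighted:", scales[unweighted[:, 150].argmax()])
```

In this toy case the weighting moves the ridge maximum to a much smaller scale and suppresses the large-scale coefficients, which is the behavior the noise threshold later exploits.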
To evaluate the performance of the Gaus2 wavelet and the weighting function for peak detection, a fully overlapping peak consisting of two peaks with the same amplitude and width is simulated, as expressed in Eq. (5). The circles in Figure 2 represent the local maximum coefficients at each scale in the wavelet space, and the resulting curve is defined as a ridge. Figure 2(a) shows the simulated overlapping peak composed of two peaks. Figures 2(b) and (c) show the local maxima obtained from the Mexican hat and Gaus2 wavelets, respectively. In general, the half-width of a wavelet increases with the scale, so the resolution of the overlapping peaks decreases with increasing scale. Owing to its smaller half-width, the Gaus2 wavelet can easily distinguish the overlapping peaks in the wavelet space. Therefore, the Gaus2 wavelet is used in the following unless otherwise stated. Figure 2(d) shows the coefficients obtained from the Gaus2 wavelet through the WCWT, which span a smaller scale range. The standard deviation (SD) of the spectral signal is used to determine the noise threshold. For the WCWT method, as the scale increases, the transform strength weakens and the transform coefficients decrease accordingly. The local maxima of the coefficient matrix are identified using the local window search method [15], and the coefficient matrix is filtered using the defined noise threshold. After filtering, only the local maxima at small scales are retained, and the composite ridges of the overlapping peaks are truncated. The noise threshold is 0.001 times the standard deviation of the spectral signal.
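The thresholded local-maximum search can be sketched as follows. The window search of [15] is not reproduced; this is a simple centered sliding-window maximum, and the function name `local_maxima_mask`, the window size, and the toy data are assumptions made for illustration.

```python
import numpy as np

def local_maxima_mask(coeffs, window=5, threshold=0.0):
    """Mark entries that are the maximum of a centered window on their own
    scale row and that also exceed the noise threshold."""
    coeffs = np.asarray(coeffs, dtype=float)
    half = window // 2
    padded = np.pad(coeffs, ((0, 0), (half, half)), constant_values=-np.inf)
    windows = np.lib.stride_tricks.sliding_window_view(padded, 2 * half + 1, axis=1)
    is_max = coeffs == windows.max(axis=-1)
    return is_max & (coeffs > threshold)

signal = np.array([0, 1, 4, 1, 0, 0, 2, 5, 2, 0], dtype=float)
threshold = 0.001 * signal.std()            # 0.001 times the SD of the signal, as in the text
rows = np.vstack([signal, 0.5 * signal])    # stand-in for two rows of a coefficient matrix
mask = local_maxima_mask(rows, window=3, threshold=threshold)
print(np.argwhere(mask))                    # (scale index, position) of retained maxima
```

Because the WCWT coefficients decay with scale, applying such a threshold removes the large-scale part of each ridge while keeping the small-scale maxima.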
As shown in Figure 3, the WCWT method retains the transformation characteristics well at smaller scales, making the peak characteristics in the low-scale region more obvious. Meanwhile, the maximum of each ridge is compressed and shifted to a smaller scale. The effects of adjacent peaks and noise on the peak ridges are weak at small scales, so the ridge maximum found at a small scale reflects the peak position more accurately. Therefore, introducing the weighting function $g(a)$ is well suited to the detection of overlapping and weak peaks using the wavelet transform.

C. SCALE PARAMETER SELECTION
The selection of scale parameters is important for the wavelet transform peak detection algorithm. The range of scale parameters is large in the traditional CWT [13], [14], [15], [16], [17], [18], [19], [20]. If the range of the scale parameter is too large, the number of calculations increases without improving accuracy. If it is too small, the accuracy of the peak detection is affected. The smaller the scale, the weaker the peaks that can be detected. The selection of the minimum scale therefore directly affects the ability to identify weak peaks.
Two choices of scale parameters are compared: $a = i$ and $a = 1.18^{i-1}$, where the integer $i$ increases from 1 to 20 and $a = 1.18^{i-1}$ is rounded to two decimal places. For $i = 20$, $1.18^{i-1} \geq 20$, so 1.18 is chosen as the base. An overlapping peak is simulated using Eq. (6), and the local maxima obtained using the WCWT are shown in Figure 4. The WCWT preserves only the local maxima at smaller scales.
The overlapping peaks in Figure 4(a) and (b) contain few data points, with an interval of one between adjacent data points. As shown in Figure 4(a), when the scale is $a = i$, the peak on the left exhibits only two local maxima over the entire scale range, which cannot form a useful peak ridge, because the length of a ridge is usually required to be greater than or equal to 3 [13]. When the scale is $a = 1.18^{i-1}$, seven local maxima exist, as shown in Figure 4(b). The smaller the spacing between scales, the better the identification of overlapping and weak peaks. More scale parameters should be placed in the range of small scale values and fewer in the range of large scale values. This reduces the redundancy of the algorithm and improves its efficiency. As shown in Figure 4(c), when the data interval is 0.5 and the scale is $a = i$, the peak on the left exhibits six local maxima, which can form a useful peak ridge.
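The two scale grids compared above are easy to generate and inspect. The sketch below only builds the grids; the cutoff of 3 used to count "small" scales is an arbitrary illustrative choice, not a value taken from this study.

```python
import numpy as np

i = np.arange(1, 21)
linear_scales = i.astype(float)                    # a = i
geometric_scales = np.round(1.18 ** (i - 1), 2)    # a = 1.18^(i-1), two decimal places

print(geometric_scales[:8])
print("largest geometric scale:", geometric_scales[-1])
print("scales below 3:", int((linear_scales < 3).sum()),
      "linear vs", int((geometric_scales < 3).sum()), "geometric")
```

The geometric grid places several times more scales in the low-scale region than the linear grid while still covering the same overall range, which is the design choice argued for in the text.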
In summary, for spectra obtained from high-resolution instruments, a scale with equal spacing is sufficient because each spectral peak is formed from many data points. However, for spectra measured using portable spectrometers with lower resolution, each peak is formed from fewer data points. Therefore, it is necessary to select scale parameters with small spacing for accurate peak detection. The resolution of the overlapping peaks decreases with an increase in scale, and scale parameters with a small spacing achieve better resolution of overlapping and weak peaks.

D. RIDGE IDENTIFICATION AND PEAK DETERMINATION
As mentioned previously, the peak ridges are closely related to the estimation of the peak positions. When extracting ridges, making full use of the wavelet space and the valleys in the original spectrum allows the peak locations to be estimated accurately. The first step in ridge identification is to obtain the local maxima in the wavelet space. At each scale of the CWT coefficient matrix, the sliding-window search method is employed to identify the local maxima [14], and the coefficient matrix is filtered using the noise threshold defined from the SD. The search results form a two-dimensional matrix of local maxima. A valley is a local minimum in the wavelet space that represents the start or end point of a peak. The local minima can also be determined by searching the two-dimensional wavelet coefficient matrix.
Subsequently, the minimum scale is selected as the initial scan scale, and the positions of the local maxima at this scale are used as the starting points of the ridges. The next scale is then scanned to find the closest local maximum for each current ridge position, and each maximum found is added to its closest ridge. If a new maximum appears that does not belong to an existing ridge, it becomes the root of a new ridge. All scales are traversed in this way until the search for the maxima is complete.
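The ridge-linking procedure described above can be sketched in plain Python. This is a simplified illustration under stated assumptions: the input is a list of local-maximum positions per scale (smallest scale first), and the tolerance `max_gap` that decides whether a maximum is close enough to continue a ridge is a hypothetical parameter not specified in the text.

```python
def link_ridges(maxima_per_scale, max_gap=2):
    """Link local-maximum positions across scales into ridges.

    Returns a list of ridges, each a list of (scale_index, position) pairs.
    """
    ridges = []
    for s, positions in enumerate(maxima_per_scale):
        used = set()
        for ridge in ridges:
            last_scale, last_pos = ridge[-1]
            if last_scale != s - 1:
                continue                     # this ridge ended at an earlier scale
            candidates = [p for p in positions if p not in used]
            if not candidates:
                continue
            nearest = min(candidates, key=lambda p: abs(p - last_pos))
            if abs(nearest - last_pos) <= max_gap:
                ridge.append((s, nearest))
                used.add(nearest)
        # Maxima not attached to an existing ridge become roots of new ridges.
        ridges.extend([[(s, p)] for p in positions if p not in used])
    return ridges

maxima = [[10, 14, 40], [10, 15, 40], [12, 41], [12]]   # toy data
ridges = link_ridges(maxima)
for ridge in ridges:
    print(ridge)
```

Ridges shorter than the minimum useful length (three points, following [13]) could then be discarded with a simple length filter.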
Local minima in the wavelet space are used to determine the start and end points of the estimated peak positions. The optimal coordinate of each ridge is located in the valley coefficient matrix. The starting point is the closest local minimum on the left side of the optimal coordinate, and the end point is the closest local minimum on the right side of the optimal coordinate. The optimal coordinate is the ridge coordinate with the highest occurrence in the wavelet space. The maximum value of the coefficient between the start and end points is then searched for. This maximum corresponds to the peak on the respective ridge.
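The peak determination step can be sketched as follows. For simplicity, the sketch searches a single row of coefficients rather than the full matrix, and it falls back to the ends of the row when no minimum exists on one side; both simplifications, as well as the name `peak_from_ridge`, are assumptions made here.

```python
from collections import Counter

def peak_from_ridge(ridge_positions, minima_positions, coeff_row):
    """Return (start, end, peak_index) for one ridge."""
    # Optimal coordinate: the position that occurs most often along the ridge.
    optimal = Counter(ridge_positions).most_common(1)[0][0]
    left = [m for m in minima_positions if m < optimal]
    right = [m for m in minima_positions if m > optimal]
    start = max(left) if left else 0
    end = min(right) if right else len(coeff_row) - 1
    # Peak: largest coefficient between the two bounding minima.
    segment = coeff_row[start:end + 1]
    peak_index = start + max(range(len(segment)), key=segment.__getitem__)
    return start, end, peak_index

row = [0.1, 0.0, 0.3, 0.9, 1.2, 0.8, 0.2, 0.05, 0.4, 0.6]   # toy coefficients
result = peak_from_ridge([4, 4, 5, 4, 3], minima_positions=[1, 7], coeff_row=row)
print(result)
```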
After all the peaks are obtained from the above processes, the false peaks not filtered out in the previous steps are removed by thresholding the maximum value on the ridge line [18]. In addition, the signal-to-noise ratio (SNR) of each retained peak is required to be greater than or equal to three for better removal of false peaks.

A. SIMULATED SPECTRA
To demonstrate the peak detection performance of the proposed method, a simulated spectrum was formed by combining several Gaussian peaks, and Gaussian white noise (20 dB) was added to the simulated spectrum. Each Gaussian peak is obtained using Eq. (7):
$$f(t) = H\, e^{-\frac{(t-c)^{2}}{2\sigma^{2}}} \tag{7}$$
where $f$ is a function of the variable $t$, $e$ is the natural constant, $H$ is the peak height, $c$ is the peak position, and $\sigma$ is the standard deviation, which is proportional to the peak width. To represent various types of spectra, the synthesized spectrum contains strong, weak, and overlapping peaks with different peak widths. The specific parameters of these peaks are listed in Table 1.
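A simulated spectrum of this kind can be generated with a short sketch. The peak parameters below are hypothetical and are not the values of Table 1, and the 20 dB figure is interpreted here as the ratio of mean signal power to noise power, which is an assumption about the noise definition.

```python
import numpy as np

def gaussian_peak(t, height, center, sigma):
    """Gaussian peak with height H, position c, and standard deviation sigma."""
    return height * np.exp(-(t - center) ** 2 / (2.0 * sigma**2))

def add_white_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise at the requested signal-to-noise ratio in dB."""
    noise_power = np.mean(signal**2) / 10 ** (snr_db / 10.0)
    return signal + rng.normal(0.0, np.sqrt(noise_power), signal.size)

rng = np.random.default_rng(0)
t = np.arange(1000, dtype=float)
# Hypothetical (height, center, sigma) triples: strong, weak, and overlapping peaks.
peaks = [(1.0, 200, 8), (0.2, 400, 5), (0.8, 600, 6), (0.7, 615, 6)]
clean = sum(gaussian_peak(t, h, c, s) for h, c, s in peaks)
noisy = add_white_noise(clean, snr_db=20.0, rng=rng)

measured_snr = 10 * np.log10(np.mean(clean**2) / np.mean((noisy - clean) ** 2))
print(round(float(measured_snr), 1))   # should be close to the requested 20 dB
```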

B. SPECTRAL DATASETS
Raman spectroscopy is performed by exciting the target sample with a single-wavelength laser, and the generated spectrum carries "fingerprint" information that can be used to identify the substance. Abundant information on the material structure usually manifests as peaks in Raman spectra. Thus, peak detection is crucial for extracting the information in the spectrum. The Romanian database of Raman spectra (RDRS) contains raw spectra, manually annotated peak information from mineral samples, crystal structures, sample images, sample origins, and vibrations, and can be obtained from http://rdrs.uaic.ro/. Therefore, this database was used to evaluate the peak detection methods in this study.
Laser-induced breakdown spectroscopy (LIBS) datasets were also used to evaluate the experimental results. As a fast chemical analysis technique that enables remote detection, LIBS has been widely adopted owing to its efficient and fast analysis and wide coverage of elements. ChemCam is a LIBS device carried by Curiosity (United States), which landed on Mars in 2012. It was the first LIBS device used for planetary exploration. The LIBS spectra and data were obtained from the National Aeronautics and Space Administration Planetary Data System (NASA PDS) and were collected using ChemCam and its backup prototype.

C. REAL LEGD SPECTRA
To verify the reliability and practicability of the WCWT, real liquid electrode glow discharge (LEGD) spectra were used as experimental data. LEGD is an atmospheric pressure glow discharge. During the discharge process, the metal ions dissolved in the solution enter the plasma and are converted into neutral metal atoms by the high-voltage discharge. The metal atoms are then excited, and energy is emitted in the form of a characteristic spectrum during the transition from the excited state to the ground state. The composition and concentration of the metal elements in the measured substance can be obtained by analyzing the spectrum to qualitatively and quantitatively determine the metal ions in the solution. This plasma has the advantages of small size, convenient portability, low cost, low excitation power, and no need for inert gas [22], [23].

A. PERFORMANCE ON RIDGES IDENTIFICATION
The experimental data for ridge identification were simulated spectra in which 12 peaks were generated, as shown in Figure 5(a). Figure 5(b) shows an image of the CWT coefficients. With an increase in scale, the wavelet coefficients increased, the resolving power decreased, and the overlapping peaks merged completely at larger scales. As shown in Figure 5(c), with the WCWT, the spectral peak features were compressed at larger scales and became clear at lower scales, which facilitated the determination of the positions of the weak and overlapping peaks.
At larger scales, the coefficient maxima of the overlapping peaks merged to form composite ridges. Figure 6 shows the local maxima after filtering with the noise threshold. The local maxima obtained using the CWT method are shown in Figure 6(a). Composite ridges are detected at the four positions marked by the red arrows, and these composite ridges mistakenly introduce false peaks. As shown in Figure 6(b), composite ridges can be truncated using the WCWT method. Although the position marked by the red ellipse in the figure exhibits an obvious curvature, the maximum value on the ridge is compressed and shifted by the weighting to a small scale with higher resolution. As a result, overlapping peaks can be readily identified by the WCWT. Furthermore, although the scale parameter range of the WCWT decreases, this is not equivalent to simply truncating the scale range to small scales, because each peak position corresponds to a different scale parameter range.

B. PEAKS DETECTION RESULT AND COMPARISONS
To evaluate the peak detection results obtained using the WCWT method, the publicly available LIBS dataset and RDRS were employed, and a comparative analysis was performed with other methods, including multiscale peak detection (MSPD) [14] and MassSpecWavelet [13].
The traditional receiver operating characteristic (ROC) curve was adopted as the standard to evaluate the peak detection results. If a determined position deviated from the true position by more than the given error margin, the corresponding peak was considered false. The sensitivity (true positive rate, TPR) and the false detection rate (FDR) of the ROC curve are calculated using Eqs. (8) and (9):
$$\mathrm{TPR} = \frac{TP}{TP + FN} = \frac{TP}{P} \tag{8}$$
$$\mathrm{FDR} = \frac{FP}{FP + TP} \tag{9}$$
where TP is the number of detected peaks that match true peaks, FN is the number of true peaks not detected by the algorithm, P is the total number of true peaks, and FP is the number of detected false peaks. ROC curves were obtained for the different peak detection methods by plotting the TPR against the FDR under different parameter settings. For the same FDR, a larger TPR corresponds to better performance of the method.
The ChemCam instrument carried by the Curiosity Mars rover has obtained a large number of LIBS spectra from the surface of Mars, covering almost 1000 different rock and soil targets, to accurately analyze the elemental composition and spectral characteristics of samples from different targets. ChemCam has three 2048-channel spectrometers covering the wavelength ranges from 240.1 to 342.2 nm, 382.1 to 469.3 nm, and 474.0 to 906.5 nm. It uses a 1067 nm Nd:KGW laser with a 350 µm spot size, 5 ns laser pulse width, and 14 mJ pulse energy. When the sample was placed at a distance of 1.5 m, ChemCam emitted 50 laser pulses at five different locations on the sample and collected the spectra. In this study, the second spectral data of 50 samples were selected and analyzed using WCWT, MSPD, and MassSpecWavelet. The WCWT method uses the Gaus2 wavelet and the Mexican hat wavelet as the mother functions. The ROC curve was used to evaluate the performance of the aforementioned methods. The SNR threshold of MassSpecWavelet was varied from 0 to 13, and the threshold values of the WCWT and MSPD methods were varied from 0.001 to 1.
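The TPR and FDR used for the ROC analysis can be computed from lists of detected and true peak positions, as in the sketch below. The greedy one-to-one matching rule and the function names `match_peaks` and `tpr_fdr` are assumptions made for illustration; the matching procedure actually used in the evaluation is not specified in the text.

```python
def match_peaks(detected, true_peaks, tolerance):
    """Greedy one-to-one matching of detected positions to true positions.

    A detected peak counts as a true positive if an unmatched true peak lies
    within the given error margin; otherwise it is a false positive.
    """
    unmatched = sorted(true_peaks)
    tp = 0
    for d in sorted(detected):
        hits = [p for p in unmatched if abs(p - d) <= tolerance]
        if hits:
            unmatched.remove(min(hits, key=lambda p: abs(p - d)))
            tp += 1
    fp = len(detected) - tp
    fn = len(unmatched)
    return tp, fp, fn

def tpr_fdr(tp, fp, fn):
    """TPR = TP / (TP + FN) and FDR = FP / (FP + TP)."""
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fdr = fp / (tp + fp) if tp + fp else 0.0
    return tpr, fdr

tp, fp, fn = match_peaks(detected=[100, 151, 300, 420],
                         true_peaks=[100, 150, 200, 300], tolerance=2)
print(tp, fp, fn, tpr_fdr(tp, fp, fn))
```

Repeating this calculation while sweeping a detection threshold yields one (FDR, TPR) point per setting, which together trace out a ROC curve.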
The ROC curves of the three methods for the LIBS dataset are shown in Figure 7(a). The MSPD and WCWT performed better than MassSpecWavelet. Overall, the TPRs of the WCWT and MSPD were much higher than those of MassSpecWavelet at all FDRs. Compared with MSPD, the performance of WCWT is superior, and the WCWT method with the Gaus2 wavelet as the mother function has the best peak detection performance.
Random noise, fluorescent baselines, overlapping peaks, and peak-dense regions coexist in the Raman spectra of RDRS, which poses significant challenges for peak detection methods. Sixty Raman spectra were selected, and the TPR and FDR were recorded at different thresholds. The threshold on the ridge maximum for the WCWT and MSPD was varied from 0.001 to 0.5. To remove false peaks more effectively, the SNR of the WCWT method was set to 3. The SNR values for the MassSpecWavelet method were varied from 0 to 20. The ROC curves of the three methods for the RDRS dataset are shown in Figure 7(b). The WCWT method again uses the Gaus2 wavelet and the Mexican hat wavelet as the mother functions. The TPRs of the WCWT were larger than those of MSPD and MassSpecWavelet at all FDRs. The WCWT is more stable, particularly when the FDR is small, which means that the WCWT can identify more true peaks while the FDR remains low. The excellent results of the WCWT on Raman spectra indicate that this method generalizes better than the other two methods. For the WCWT method, the Gaus2 wavelet with a smaller half-width exhibited better peak detection performance than the Mexican hat wavelet.
To evaluate the peak detection performance of the different methods further, the $F_1$ measure was employed as another criterion. The $F_1$ metric is a trade-off between the false discovery rate and the sensitivity that quantifies the algorithm performance [8]. A larger $F_1$ indicates better peak detection performance. The $F_1$ measure is defined by Eq. (10):
$$F_1 = \frac{2\,(1 - \mathrm{FDR}) \cdot \mathrm{TPR}}{(1 - \mathrm{FDR}) + \mathrm{TPR}} \tag{10}$$
Table 2 shows the optimal $F_1$ measure obtained using the three peak detection methods and the corresponding TPR and FDR values for the RDRS. The $F_1$ values of the WCWT method for both wavelets are higher than those of the other two methods, illustrating the applicability of this method. This also means that the WCWT can achieve peak detection with a high true positive rate and a low false detection rate. In addition, the WCWT method using the Gaus2 wavelet with a smaller half-width has better peak detection performance than that using the Mexican hat wavelet.
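As a worked illustration of the $F_1$ measure, the sketch below computes it from a TPR and an FDR, using the standard definition as the harmonic mean of precision ($1 - \mathrm{FDR}$) and sensitivity (TPR). The function name `f1_from_rates` and the input values are illustrative and are not results from Table 2.

```python
def f1_from_rates(tpr, fdr):
    """F1 as the harmonic mean of precision (1 - FDR) and sensitivity (TPR)."""
    precision = 1.0 - fdr
    if precision + tpr == 0:
        return 0.0
    return 2.0 * precision * tpr / (precision + tpr)

example_f1 = f1_from_rates(tpr=0.75, fdr=0.25)   # hypothetical rates
print(example_f1)
```

Because both a low TPR and a high FDR pull the harmonic mean down, a method reaches a large $F_1$ only when it finds most true peaks and reports few false ones.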
Because MassSpecWavelet, MSPD, and WCWT are all based on the continuous wavelet transform, they can avoid the interference of noise and the baseline to some extent. However, these methods perform differently in weak and overlapping peak detection and in false peak removal. As shown in Figure 8(c), for the Raman spectra, the detection performance of MassSpecWavelet, which uses the SNR to identify peaks, is poor. As shown in Figure 8(b), the MSPD method can detect partially overlapping peaks because it makes full use of the peak information in the wavelet space, but it still misses some overlapping and weak peaks. Figure 8(a) shows the peak detection results of the WCWT method, which detects each peak among the overlapping peaks; the weak peaks are marked with a box. Clearer low-scale spectral peak characteristics were obtained using the WCWT. In addition, overlapping and weak peaks can be easily identified by selecting the Gaus2 wavelet with a smaller linewidth and scale parameters with small spacing. The only unsatisfactory result was that the broad peak indicated by the right oval mark in Figure 8(a) was detected as an overlapping peak composed of two peaks, which may indirectly reflect the strong ability of the WCWT method to resolve overlapping peaks.

C. RELIABILITY AND PRACTICABILITY
LEGD spectra were acquired using a miniature fiber optic spectrometer (AvaSpec-ULS3648, Avantes, Netherlands), and the liquid glow discharge device was described in detail in a previous study [24]. The simulated water samples contained Zn, Cd, Cu, Pb, and Na ions. Figure 9(a) shows the peak position information obtained by the WCWT method, which accurately detects eight spectral lines of the five metal elements in the simulated water sample: Zn (213.7 nm), Cd (228.8 nm), Cu (324.7 and 327.4 nm), Pb (368.4 and 405.8 nm), and the overlapping peaks of Na at 589.0 and 589.6 nm. In addition, the molecular band spectra of OH (nm) and N₂ (315-406 nm) as well as the atomic lines of Hβ (486.1 nm) and Hα (656.5 nm) were accurately detected. The Hβ and Hα lines provide important basic data for calculating the plasma parameters in LEGD.
The plasma excitation temperature was measured using the relative intensities of Hβ and Hα [25]. The plasma electron density was calculated using the Stark-broadened profiles of the Hβ lines. Figure 9(b) shows the peak-finding results of the MSPD method. Although most of the spectral peaks are well identified, the overlapping peak of sodium (589.6 nm) cannot be identified. The WCWT method provides clearer spectral peak characteristics by weighting and by simultaneously selecting scale parameters with a small spacing and a wavelet function with a small linewidth, which allows overlapping peaks to be identified effectively. The results show that the WCWT method has better reliability and practicability for processing the LEGD spectra.

V. CONCLUSION
In this study, an improved peak detection method based on the weighted continuous wavelet transform is proposed. In addition, scale parameters with a small spacing and a Gaus2 wavelet with a small linewidth were selected. The WCWT weakens the intensity of the transform at larger scales and causes low-scale regions to exhibit more obvious spectral peak characteristics. Setting a noise threshold based on the standard deviation of the spectral signal effectively truncates the composite ridges. In addition, the WCWT combines ridge and valley information in the coefficient matrix for better identification of the spectral peaks. The simulated spectral results showed that the WCWT achieved better ridge identification and peak finding accuracy. The detection results for the Romanian database of Raman spectra show that the WCWT method can attain a high TPR while maintaining a low FDR and has better peak detection performance than MSPD and MassSpecWavelet, especially for the detection of overlapping peaks. The practical application of the WCWT method to LEGD spectroscopy shows good reliability and practicability.
YONGJIE ZHOU received the B.S. degree in physical sciences from Lanzhou City University, Lanzhou, China, in 2011, and the M.S. degree in plasma physics from Northwest Normal University, Lanzhou, in 2014. He is currently pursuing the Ph.D. degree in computer science and technology with Qinghai Normal University, Xining, China. Since 2014, he has been a Lecturer with the Physics and Electronic Engineering Department, Qinghai Normal University. His research interests include low-temperature plasma discharge processes, liquid discharge, spectroscopic diagnostics, and algorithmic analysis.