Synthesis and Modification of Cetacean Tonal Sounds for Underwater Bionic Covert Detection and Communication

Most conventional bionic signal design methods cannot construct high-similarity bionic signals that match complex cetacean sounds, because they rely on relatively simple bionic signal models. The few methods based on weighted signal superposition can construct high-similarity bionic signals, but their parameters are very difficult to adjust to match different cetacean sounds or to synthesize other desired bionic signals. To solve these problems, two bionic signal models are first proposed individually to mimic cetacean sounds with simple time-frequency (TF) structures, and are then combined, through a piecewise construction strategy, to mimic cetacean sounds with complex TF structures. Based on the two models, the parameters of the synthesized bionic signals can be adjusted to improve their detection and communication performance. The experimental results show that the Pearson correlation coefficients (PCCs) between 13 true cetacean sounds and their corresponding bionic signals are all higher than 0.97, and 11 of them are no less than 0.99. Four key performance indicators of a bionic signal improve by more than 40% when the bandwidth increases by 1 kHz. The results demonstrate that the proposed method can not only imitate simple and complex cetacean tonal sounds efficiently and with high similarity, but also construct a variety of bionic signals of the same type by simply adjusting the model parameters. In addition, the proposed method can be applied to other areas, such as constructing new cetacean sound databases.


I. INTRODUCTION
Because they actively emit signals, active sonar detection (ASD) and underwater acoustic communication (UAC) systems can easily be detected by adversaries. In the last few decades, many methods have been proposed to improve the covertness of sonar signals and underwater communication signals [1]-[6].
Underwater bionic covert detection and communication is a novel approach to realizing covert ASD and UAC. Its main idea is to disguise sonar or communication signals as cetacean sounds. When underwater monitoring systems perform identification, these bionic sonar and communication signals could be classified as ocean noise and filtered out [7]-[10], thereby achieving covert ASD and UAC. As an approach with great potential, underwater bionic covert detection and communication has been attracting more and more attention in recent years [11]-[22].
The associate editor coordinating the review of this manuscript and approving it for publication was Jingchang Huang.
The design of bionic signals is the key to underwater bionic covert detection and communication. More specifically, bionic signals should meet both the covertness and the validity requirements (e.g., communication rate and detection accuracy) of covert ASD and UAC systems, and the covertness of these systems depends on the camouflage ability of the bionic signals.
The design of bionic signals includes signal synthesis and modification. To ensure the camouflage ability, the synthesis of bionic signals must match their acoustic characteristics, such as waveform shape, frequency distribution and time-frequency (TF) distribution, to those of cetacean sounds as closely as possible. However, bionic signals with high camouflage ability do not necessarily meet the validity requirements of ASD and UAC. Therefore, efficient signal models and corresponding signal construction methods are very important for efficiently and effectively imitating all kinds of simple and complex cetacean tonal sounds.
In addition, the synthesis and modification of bionic signals also play an important role in other areas. For example, efficient synthesis and modification models and methods can be used to construct a new cetacean sound database based on existing ones [23], and to construct all kinds of bionic signals to simulate cetaceans and then evaluate the behavioural impact of different acoustic characteristics on cetaceans [24].
There has been some progress over the years in the design of bionic signals based on cetacean sounds. Some researchers use original cetacean sounds to construct bionic signals [11]. However, the original cetacean sound database is usually limited, and it is difficult to find original cetacean sounds that meet the validity requirements of covert ASD and UAC. Because of this limitation, some researchers try to build suitable bionic signal models to mimic the original cetacean sounds [12]-[16], [24], [25].
Tonal sounds are a large and important subset of cetacean sounds, and they are produced by both toothed whales [26]-[28] and baleen whales [29], which are sister clades containing all extant whales. Although the acoustic characteristics of tonal sounds vary enormously within cetaceans, such sounds are broadly defined as frequency modulation (FM) signals [30], [31]. Furthermore, a cetacean tonal sound is usually characterized by its trace in the time-frequency spectrogram (TFS), which is usually referred to as the "contour" of the tonal sound [32].
Due to the wide distribution of tonal sounds as well as their diverse acoustic characteristics in terms of duration, frequency distribution and TF distribution, etc., the bionic signal models based on tonal sounds can meet different camouflage ability and validity requirements of covert ASD and UAC.
In order to meet the validity requirements, the bionic signal models should parameterize the tonal sounds, so that the parameters of the bionic signals can be conveniently adjusted according to the validity requirements of ASD and UAC. For the camouflage ability requirements, the bionic signal models should achieve high-similarity mimicry of various tonal sounds.
Existing bionic signal models can be divided into two categories. The first category [12], [24] uses weighted signal superposition to synthesize bionic signals. The tonal sound is modeled as a weighted superposition of harmonically related sinusoids, and the individual sinusoidal frequencies are estimated over windowed data. Since the bionic signal is expressed as a large number of short data blocks, its ASD and UAC performance can only be changed by modifying each data block, which is not practical.
The second category [13]-[16], [25] constructs the bionic signal from basic FM signal models. Capus et al. proposed a bionic sonar signal model with a double down-chirp structure for bottlenose dolphin clicks [25], and obtained high similarity for bottlenose dolphin clicks with that structure. Ahmad E. et al. modeled dolphin whistles with basic FM signal models, such as the linear frequency modulation (LFM) signal and the hyperbolic frequency modulation (HFM) signal, and designed bionic signals carrying information bits for UAC [13], [14]. Liu et al. proposed using a series of segmented LFM signals carrying digital information to mimic nonlinear frequency modulation (NFM) whistles and thereby achieve bionic covert UAC [15]. In 2018, a bionic sonar signal model based on the HFM signal model was proposed, which realized high-similarity mimicry of false killer whale whistles and high-precision ASD [16]. Since the TF structures of these models are simple, it is difficult to use them to achieve high-similarity imitation of cetacean sounds with complex NFM characteristics.
In this paper, we propose two bionic signal models and one piecewise construction strategy for complex cetacean tonal sounds for covert ASD and UAC. By analyzing the contours of cetacean tonal sounds, it is found that the contours of cetacean sounds are similar to those of the power frequency modulation (PFM) signal and the sinusoidal frequency modulation (SFM) signal. Therefore, based on PFM and SFM signal models, two bionic signal models are presented, which can parameterize the characteristics, such as curvature, slope, frequency range and duration, of the tonal sound contours. Then, combined with the waveform envelope extraction method for time-domain signal and the piecewise construction strategy, high-similarity mimicry of various cetacean sounds is realized.
The main contributions of this paper can be summarized as follows:
(1) Two bionic signal models and their piecewise construction strategy are proposed, which realize high-similarity mimicry of most cetacean tonal sounds and of some cetacean sounds with simple or complex TF structures.
(2) The proposed method parameterizes the acoustic characteristics of cetacean sounds, so that the parameters of the synthesized bionic signals can be conveniently modified to obtain high camouflage ability and good detection and communication performance.
(3) The proposed method can construct bionic signals whose characteristics are similar to those of existing cetacean sounds. As a result, these constructed bionic signals can be applied to expand the existing cetacean sound database, and to evaluate the behavioural impact of different acoustic characteristics on cetaceans.
VOLUME 8, 2020

II. PREPROCESSING AND ANALYSIS OF CETACEAN TONAL SOUNDS
In this paper, the tonal sounds of three common cetacean species (bottlenose dolphin, long-finned pilot whale and false killer whale) are taken as examples to illustrate the proposed models and method. The original high-quality cetacean tonal sounds were recorded at a sampling rate of 44.1 ksps. Because the tonal sounds were contaminated by Gaussian ocean ambient noise [33], a Wiener filter is used to remove the background noise from the recorded sounds, with the first 0.25 s of each original recording used for a priori SNR estimation. The TFSs of the denoised tonal sounds are then generated by the short-time Fourier transform (STFT) with a 1024-point (25 ms) Hamming window and 60% overlap.
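The framing implied by these spectrogram settings can be sketched as follows. This is only an illustrative sketch, not the authors' preprocessing code: the function names are hypothetical, the hop size is assumed to be the window length times (1 - overlap) rounded to the nearest sample, and the Hamming window is assumed to use the common symmetric 0.54/0.46 definition. For reference, 1024 samples at 44.1 ksps span roughly 23 ms.

```python
import math


def hamming_window(length):
    """Symmetric Hamming window (0.54/0.46 form); the exact variant is an assumption."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_layout(num_samples, window_len=1024, overlap=0.6):
    """Return the hop size and the start index of every full analysis frame."""
    hop = max(1, int(round(window_len * (1.0 - overlap))))
    starts = list(range(0, num_samples - window_len + 1, hop))
    return hop, starts


SAMPLE_RATE = 44100.0                            # 44.1 ksps, as stated in the text
hop, starts = frame_layout(num_samples=44100)    # one second of audio
window_seconds = 1024 / SAMPLE_RATE              # time span covered by one frame
```

Each frame would then be multiplied by the window and passed to a DFT to obtain one column of the TFS.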
Tonal sounds are usually classified according to their contours [26]-[29], and because of the complexity and diversity of cetacean species and their sounds, a definition of the categories of tonal sound contours is usually specific to a certain cetacean species.
By studying various classification methods for tonal sounds, it is found that the classification method proposed by Bazúa-Durán and Au [26] is highly versatile and can be applied to most tonal sounds. Using this classification method as a reference, most tonal sounds can be assigned to one of six categories according to their contours, as described in Table 1 and illustrated in Fig. 1: constant frequency, upsweep, downsweep, concave, convex and sine. As shown in Fig. 1, the constant frequency tonal sound essentially has no change in frequency, which makes it similar to a continuous wave (CW) signal. In contrast, the upsweep, downsweep, concave, convex and sine tonal sounds differ from each other, and they all have complex NFM characteristics. Therefore, to ensure the camouflage ability of the bionic signals, the bionic signal models should match the NFM characteristics of different tonal sounds. Furthermore, as can be seen from Fig. 1, the NFM characteristics of different parts of a tonal sound may be different, and these characteristics can be divided into two categories: in the first, the absolute value of the contour slope changes monotonically (for example, increases monotonically), which is similar to the PFM signal; in the second, the absolute value of the contour slope first increases and then decreases, which is similar to the SFM signal. Therefore, in the next section, based on the PFM signal model and the SFM signal model, we propose two bionic signal models, and use multiple bionic signals with different NFM characteristics to construct different parts of a tonal sound.

III. BIONIC SIGNAL MODELS
As the idea in this paper is to mimic cetacean tonal sounds from the perspective of contour characteristics, the first step is to construct the TF expression of the bionic signal based on the tonal sound contour, and the last step is to transform the TF expression into the time-domain waveform. If the TF expression of an FM signal $s(t)$ is defined as $f(t)$, the corresponding phase function $\phi(t)$ can be defined as
$$\phi(t) = 2\pi \int_0^t f(\tau)\,\mathrm{d}\tau. \tag{1}$$
Then, the FM signal $s(t)$, $0 \le t \le T$, can be expressed in terms of $\phi(t)$ as in (2), where $T$ is the duration and $A(t)$ denotes the signal envelope function.
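The transformation from a TF expression to a time-domain waveform can be sketched numerically as follows. This is a minimal sketch under stated assumptions: the carrier is taken to be a cosine of the phase, the phase integral is approximated with the trapezoidal rule, and the function name `synthesize_fm` and its parameters are hypothetical rather than taken from the paper.

```python
import math


def synthesize_fm(contour, duration, sample_rate, envelope=lambda t: 1.0):
    """Sample A(t) * cos(phi(t)), where phi(t) is 2*pi times the running
    integral of the contour f(t), approximated with the trapezoidal rule.
    The cosine carrier is an assumption of this sketch."""
    step = 1.0 / sample_rate
    num_samples = int(round(duration * sample_rate))
    phase = 0.0
    samples = []
    for i in range(num_samples):
        t = i * step
        samples.append(envelope(t) * math.cos(phase))
        # Advance the phase by 2*pi times the integral of f over one sample period.
        phase += math.pi * (contour(t) + contour(t + step)) * step
    return samples


# A constant 1 kHz contour sampled at 8 ksps repeats every 8 samples.
tone = synthesize_fm(lambda t: 1000.0, duration=0.01, sample_rate=8000.0)
```

Any contour function of time in hertz can be passed in place of the constant one, so the same routine applies to every TF expression discussed in this section.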

A. POWER FREQUENCY MODULATION BIONIC (PFMB) SIGNAL MODEL
The first bionic signal model is based on the PFM signal model. A PFM signal [34] with duration $T$ is defined in (3), where $B$ is the bandwidth, $f_C$ plays the part of the carrier frequency, and $\alpha$ ($\alpha > 0$) is a curvature adjustment factor. The instantaneous frequency of the PFM signal is expressed as
$$\tilde{f}_P(t) = B\left(\frac{t}{T}\right)^{\alpha} + f_C, \quad 0 \le t \le T. \tag{4}$$
Furthermore, the contour slope of the PFM signal is
$$\tilde{f}'_P(t) = \frac{\alpha B}{T}\left(\frac{t}{T}\right)^{\alpha - 1}. \tag{5}$$
Obviously, $\tilde{f}_P(t)$ is a power function of time $t$. The start frequency $\tilde{f}_P(0)$ and the end frequency $\tilde{f}_P(T)$ are $f_C$ and $f_C + B$, respectively. Clearly, the contour curvature and contour slope of the PFM signal depend on its frequency range, duration and parameter $\alpha$. However, when trying to fit the contour of an upsweep tonal sound using a PFM signal with $\alpha > 1$, we find that there is a significant mismatch between them, as shown in Fig. 2. This mismatch is caused by the different contour slopes of the upsweep tonal sound and the PFM signal at time $t = 0$. As can be seen from (5), when $\alpha > 1$, the contour slope of the PFM signal at time $t = 0$ is always $\tilde{f}'_P(0) = 0$, no matter how its frequency range, duration and parameter $\alpha$ change.
For a close match between the bionic signals and the tonal sounds, based on (4), the instantaneous frequency of a novel PFMB signal model is proposed as
$$f_P(t) = (B - kT)\left(\frac{t}{T}\right)^{\alpha} + kt + f_C, \quad 0 \le t \le T, \tag{6}$$
where the curvature adjustment factor $\alpha$ ($\alpha > 0$) is used to adjust the curvature of $f_P(t)$, and the slope adjustment factor $k$ ($0 \le k \le B/T$) is used to adjust the slope of $f_P(t)$ at time $t = 0$ and $t = T$. Clearly, $f_P(t)$ goes continuously and monotonically from the start frequency $f_P(0) = f_C$ to the end frequency $f_P(T) = f_C + B$ within the signal duration, in the manner of a PFM signal. By substituting (6) into (1) and (2), we obtain the corresponding PFMB signal model $s_P(t)$ in (7). As can be seen from (6) and (7), the PFMB signal model is more general than the PFM model. More specifically, when $k = 0$, $s_P(t)$ is equivalent to the PFM signal model; when $k = B/T$, $f_P(t) = Bt/T + f_C$ and $s_P(t)$ is equivalent to an LFM signal model. In particular, when $B = 0$, $f_P(t) = f_C$ and $s_P(t)$ is equivalent to a CW signal model.
The contour slope of $s_P(t)$ is defined as
$$f'_P(t) = \alpha\left(\frac{B}{T} - k\right)\left(\frac{t}{T}\right)^{\alpha - 1} + k. \tag{8}$$
As can be seen from (6) and (8), owing to the introduction of the slope adjustment factor $k$, the contour slope of $s_P(t)$ now depends on its frequency range, duration, curvature adjustment factor $\alpha$ and slope adjustment factor $k$. When the frequency range and duration of $s_P(t)$ are fixed, the contour curvature can be adjusted by changing $\alpha$, and the contour slope at time $t = 0$ and $t = T$ can be adjusted by changing $k$, which is very important for mimicking the contours of true tonal sounds as closely as possible. For example, when the carrier frequency is $f_C = 6$ kHz, the bandwidth is $B = 4$ kHz, the duration is $T = 0.4$ s, and the midpoint of the frequency range is $f_{1/2} = f_C + B/2 = 8$ kHz, the contours of $s_P(t)$ with different curvature adjustment factors $\alpha$ and slope adjustment factors $k$ are shown in Fig. 3(a) and Fig. 3(b), respectively. When $k$ is constant ($k = 0$), the contours of $s_P(t)$ with three different values of $\alpha$ are shown in Fig. 3(a). When $\alpha$ is constant ($\alpha = 2$), the contours of $s_P(t)$ with three different values of $k$ are shown in Fig. 3(b).
As shown in Fig. 3(a), the contour curvature of $s_P(t)$ changes with $\alpha$, and the value of $\alpha$ also affects the monotonicity of the contour slope $f'_P(t)$ and the varying speed of $f_P(t)$. When $0 < \alpha < 1$, the contour slope $f'_P(t)$ decreases monotonically with time $t$, and therefore $f_P(t)$ changes faster in the frequency range below $f_{1/2}$ and $f_P(T/2) > f_{1/2}$; for $\alpha = 1$, $f'_P(t)$ is always $B/T$, which means that $s_P(t)$ is equivalent to an LFM signal and $f_P(T/2)$ is exactly $f_{1/2}$; when $\alpha > 1$, $f'_P(t)$ increases monotonically with $t$, and therefore $f_P(t)$ changes faster in the frequency range above $f_{1/2}$ and $f_P(T/2) < f_{1/2}$.
It can be seen from Fig. 3(b) that as $k$ increases from 0 to $B/T = 10$ kHz/s, the contour slope of $s_P(t)$ at time $t = 0$ and $t = T$ also changes continuously and gradually approaches $B/T$. Therefore, one can mimic the contours of true tonal sounds as closely as possible by adjusting $\alpha$ and $k$ while the frequency range and the duration are fixed.
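The behaviour of the PFMB contour described above can be checked with a short numerical sketch. It assumes the contour has the form f_P(t) = (B - kT)(t/T)^α + kt + f_C, which is the form consistent with the special cases stated in the text (k = 0 gives PFM, k = B/T gives LFM, and the slope at t = T equals α(B/T - k) + k). The function names are hypothetical, and frequencies are in Hz with k in Hz/s.

```python
def _check_pfmb(bandwidth, duration, alpha, k):
    # Parameter ranges quoted in the text: alpha > 0 and 0 <= k <= B/T.
    if alpha <= 0.0 or not 0.0 <= k <= bandwidth / duration:
        raise ValueError("require alpha > 0 and 0 <= k <= B/T")


def pfmb_contour(t, f_c, bandwidth, duration, alpha, k=0.0):
    """Assumed PFMB instantaneous frequency for 0 <= t <= duration."""
    _check_pfmb(bandwidth, duration, alpha, k)
    return (bandwidth - k * duration) * (t / duration) ** alpha + k * t + f_c


def pfmb_slope(t, f_c, bandwidth, duration, alpha, k=0.0):
    """Time derivative of pfmb_contour (unbounded at t = 0 when alpha < 1)."""
    _check_pfmb(bandwidth, duration, alpha, k)
    return alpha * (bandwidth / duration - k) * (t / duration) ** (alpha - 1.0) + k


# Example values from the text: f_C = 6 kHz, B = 4 kHz, T = 0.4 s.
F_C, B, T = 6000.0, 4000.0, 0.4
mid_alpha_2 = pfmb_contour(T / 2, F_C, B, T, alpha=2.0)      # below the 8 kHz midpoint
mid_alpha_half = pfmb_contour(T / 2, F_C, B, T, alpha=0.5)   # above the 8 kHz midpoint
```

With these definitions, raising k from 0 toward B/T moves the slope at t = 0 (for α > 1) from 0 toward B/T, which is the effect illustrated in Fig. 3(b).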

B. SINUSOIDAL FREQUENCY MODULATION BIONIC (SFMB) SIGNAL MODEL
The second bionic signal model is based on the SFM signal model. An SFM signal [35] with duration $T$ is defined in (9). The instantaneous frequency of the SFM signal is
$$\tilde{f}_S(t) = \frac{B}{2}\left[1 - \cos\left(\frac{\pi t}{T}\right)\right] + f_C = B\sin^{2}\left(\frac{\pi t}{2T}\right) + f_C, \quad 0 \le t \le T. \tag{10}$$
Furthermore, the contour slope of the SFM signal is
$$\tilde{f}'_S(t) = \frac{\pi B}{2T}\sin\left(\frac{\pi t}{T}\right). \tag{11}$$
Obviously, $\tilde{f}_S(t)$ is a sinusoidal function of time $t$. The start frequency $\tilde{f}_S(0)$ and the end frequency $\tilde{f}_S(T)$ are $f_C$ and $f_C + B$, respectively. Clearly, the contour curvature and contour slope of the SFM signal depend only on its frequency range and duration. However, when trying to fit a part of the contour of a sine tonal sound using the SFM signal, we find that there is a significant mismatch between them. Fig. 4(a) shows the complete contour of a sine tonal sound, and Fig. 4(b) shows the contours of an SFM signal and of part of the sine tonal sound. This mismatch is caused by the different contour curvatures of the tonal sound and the SFM signal. As can be seen from (10), once the frequency range and the duration of an SFM signal are fixed, neither its contour curvature nor its slope can be changed.
To match true tonal sounds closely, based on (10), and following the way the contour curvature and contour slope are adjusted in the PFMB signal model, the instantaneous frequency of a novel SFMB signal model is proposed as
$$f_S(t) = (B - hT)\sin^{\beta}\left(\frac{\pi t}{2T}\right) + ht + f_C, \quad 0 \le t \le T, \tag{12}$$
where the curvature adjustment factor $\beta$ ($\beta > 0$) is used to adjust the curvature of $f_S(t)$, and the slope adjustment factor $h$ ($0 \le h \le B/T$) is used to adjust the slope of $f_S(t)$ at time $t = 0$ and $t = T$. Clearly, $f_S(t)$ changes continuously and monotonically from the start frequency $f_S(0) = f_C$ to the end frequency $f_S(T) = f_C + B$ within the signal duration, in the manner of an SFM signal. By substituting (12) into (1) and (2), we obtain the corresponding SFMB signal model $s_S(t)$ in (13). As can be observed from (12) and (13), the proposed SFMB signal model is more general than the SFM signal model. More specifically, when $h = 0$ and $\beta = 2$, $s_S(t)$ is equivalent to the SFM signal model. Most importantly, the contour slope of $s_S(t)$ is
$$f'_S(t) = \frac{\pi \beta (B - hT)}{2T}\sin^{\beta - 1}\left(\frac{\pi t}{2T}\right)\cos\left(\frac{\pi t}{2T}\right) + h. \tag{14}$$
As can be seen from (12) and (14), owing to the addition of the curvature adjustment factor $\beta$ and the slope adjustment factor $h$, the contour curvature and contour slope of $s_S(t)$ now depend on its frequency range, duration, and parameters $h$ and $\beta$. When the frequency range and duration of $s_S(t)$ are fixed, the contour curvature can be adjusted by changing $\beta$, and the contour slope at time $t = 0$ and $t = T$ can be adjusted by changing $h$, which is very important for mimicking the contours of true tonal sounds as closely as possible. When $h$ is constant ($h = 0$), the contours of $s_S(t)$ with different values of $\beta$ are shown in Fig. 5(a). When $\beta$ is constant ($\beta = 2$), the contours of $s_S(t)$ with three different values of $h$ are shown in Fig. 5(b).
It can be seen from Fig. 5(a) that the contour curvature of $s_S(t)$ changes with $\beta$. The value of $\beta$ also affects the monotonicity of the contour slope $f'_S(t)$: when $0 < \beta \le 1$, $f'_S(t)$ decreases monotonically as time $t$ increases; when $\beta > 1$, $f'_S(t)$ first increases and then decreases as $t$ increases. Furthermore, the value of $\beta$ affects the varying speed of $f_S(t)$: when $0 < \beta < 2$, $f_S(t)$ varies faster in the frequency range below $f_{1/2}$ and therefore $f_S(T/2) > f_{1/2}$; for $\beta = 2$, $f_S(t)$ is symmetric about the center point $(T/2, f_{1/2})$ and thus $f_S(T/2) = f_{1/2}$; when $\beta > 2$, $f_S(t)$ varies more slowly in the frequency range below $f_{1/2}$, so that $f_S(T/2) < f_{1/2}$.
As shown in Fig. 5(b), the contour slope of s S (t) at time t = 0 and t = T changes continuously and gradually approaches B/T as h increases from 0 to B/T = 10.
Therefore, one can mimic the contours of the true tonal sounds as closely as possible by adjusting β and h on the condition that the frequency range and the duration are fixed.
Moreover, by comparing Fig. 3(a) and Fig. 5(a), it can be observed that when $\alpha < 1$ and $\beta < 1$, the contour slopes of both the PFMB signal, $f'_P(t)$, and the SFMB signal, $f'_S(t)$, change monotonically. When $k = 0$ and $h = 0$, $f'_P(t)$ with four different values of $\alpha$ is shown in Fig. 6(a), and $f'_S(t)$ with four different values of $\beta$ is shown in Fig. 6(b). It can be observed that $f'_P(T)$ is always greater than zero, whereas $f'_S(T)$ is always zero. From (8), it can be demonstrated that $f'_P(T) = \alpha(B/T - k) + k$; since $\alpha > 0$ and $0 \le k \le B/T$, $f'_P(T) > 0$ always holds no matter how $\alpha$ and $k$ change, which means that the contour slope of the PFMB signal $s_P(t)$ is always greater than 0 at time $t = T$. However, from (14), it can be demonstrated that $f'_S(T) = h$, which means that the contour slope of the SFMB signal $s_S(t)$ at time $t = T$ is determined only by $h$, and $f'_S(T) = 0$ always holds when $h = 0$.
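The properties of the SFMB contour listed above can be verified numerically. The sketch below assumes the contour has the form f_S(t) = (B - hT) sin^β(πt/(2T)) + ht + f_C. This form is an assumption chosen because it reproduces the stated behaviour (symmetry about the center point for β = 2, the midpoint inequalities for β below and above 2, and an end slope equal to h); the function names are hypothetical.

```python
import math


def sfmb_contour(t, f_c, bandwidth, duration, beta, h=0.0):
    """Assumed SFMB instantaneous frequency (Hz) for 0 <= t <= duration."""
    if beta <= 0.0 or not 0.0 <= h <= bandwidth / duration:
        raise ValueError("require beta > 0 and 0 <= h <= B/T")
    x = math.pi * t / (2.0 * duration)
    return (bandwidth - h * duration) * math.sin(x) ** beta + h * t + f_c


def sfmb_slope(t, f_c, bandwidth, duration, beta, h=0.0):
    """Time derivative of sfmb_contour (unbounded at t = 0 when beta < 1)."""
    if beta <= 0.0 or not 0.0 <= h <= bandwidth / duration:
        raise ValueError("require beta > 0 and 0 <= h <= B/T")
    x = math.pi * t / (2.0 * duration)
    gain = math.pi * beta * (bandwidth - h * duration) / (2.0 * duration)
    return gain * math.sin(x) ** (beta - 1.0) * math.cos(x) + h


# Same example band as for the PFMB model: 6 to 10 kHz over 0.4 s.
F_C, B, T = 6000.0, 4000.0, 0.4
midpoints = {beta: sfmb_contour(T / 2, F_C, B, T, beta) for beta in (1.0, 2.0, 4.0)}
end_slope = sfmb_slope(T, F_C, B, T, beta=2.0, h=1500.0)   # expected to equal h
```

Under this assumed form, the cosine factor vanishes at t = T, which is why the end slope is set by h alone while the PFMB end slope also depends on α.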

C. SUB BIONIC SIGNAL MODELS
As analyzed above, PFMB and SFMB signal models can only be utilized to match tonal sounds with monotonically increasing frequency.
In order to increase the diversity of bionic signals based on the PFMB signal model and the SFMB signal model, we propose four corresponding sub-signal models.
Based on the TF expression of the PFMB signal model defined by (6), four sub-PFMB TF expressions $f_{PO}(t)$, $f_{PX}(t)$, $f_{PY}(t)$ and $f_{PZ}(t)$ are defined as follows:
$$f_{PO}(t) = f_P(t), \qquad f_{PX}(t) = B + 2f_C - f_P(t),$$
$$f_{PY}(t) = f_P(T - t), \qquad f_{PZ}(t) = B + 2f_C - f_P(T - t).$$
Through (1) and (2), we can obtain four sub-PFMB signal models $s_{PO}(t)$, $s_{PX}(t)$, $s_{PY}(t)$ and $s_{PZ}(t)$. Obviously, $f_{PO}(t)$, $f_{PX}(t)$, $f_{PY}(t)$ and $f_{PZ}(t)$ are power functions of time $t$. When the frequency range is 6 to 10 kHz and the duration is $T = 0.4$ s, the curves of these TF expressions with three different values of $\alpha$ are shown in Fig. 7(a)-(d). It can be seen that $f_{PO}(t)$ is identical to the TF expression of the original PFMB signal model $f_P(t)$, and that $f_{PO}(t)$ is symmetric with $f_{PX}(t)$, $f_{PY}(t)$ and $f_{PZ}(t)$ about the axis $f = B/2 + f_C$, the axis $t = T/2$ and the center point $(T/2, B/2 + f_C)$, respectively. Besides, $f_{PO}(t)$ and $f_{PZ}(t)$ go from the start frequency $f_C$ to the end frequency $B + f_C$ within a signal duration $T$, whereas $f_{PX}(t)$ and $f_{PY}(t)$ go from the start frequency $B + f_C$ to the end frequency $f_C$ within the same duration $T$.
In a similar way, based on the TF expression of the SFMB signal model defined by (12), four sub-SFMB TF expressions $f_{SO}(t)$, $f_{SX}(t)$, $f_{SY}(t)$ and $f_{SZ}(t)$ are defined as follows:
$$f_{SO}(t) = f_S(t), \qquad f_{SX}(t) = B + 2f_C - f_S(t),$$
$$f_{SY}(t) = f_S(T - t), \qquad f_{SZ}(t) = B + 2f_C - f_S(T - t).$$
Through (1) and (2), we can obtain four sub-SFMB signal models $s_{SO}(t)$, $s_{SX}(t)$, $s_{SY}(t)$ and $s_{SZ}(t)$. Obviously, $f_{SO}(t)$, $f_{SX}(t)$, $f_{SY}(t)$ and $f_{SZ}(t)$ are sinusoidal functions of time $t$, and $f_{SO}(t)$ is identical to the TF expression of the original SFMB signal model $f_S(t)$. When the frequency range is 6 to 10 kHz and the duration is $T = 0.4$ s, the curves of $f_{SO}(t)$, $f_{SY}(t)$, $f_{SX}(t)$ and $f_{SZ}(t)$ with four different values of $\beta$ are shown in Fig. 8(a)-(d), respectively. It can be seen that $f_{SO}(t)$ is symmetric with $f_{SX}(t)$, $f_{SY}(t)$ and $f_{SZ}(t)$ about the axis $f = B/2 + f_C$, the axis $t = T/2$ and the center point $(T/2, B/2 + f_C)$, respectively. Besides, for $f_{SO}(t)$ and $f_{SZ}(t)$, the start frequency and end frequency are $f_C$ and $B + f_C$, respectively, whereas those of $f_{SX}(t)$ and $f_{SY}(t)$ are $B + f_C$ and $f_C$, respectively.
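The symmetry relations that generate the four sub-models can be sketched as transformations of any rising contour. In the sketch below, the helper name `sub_contours` and the example base contour (a plain power-law sweep with exponent 2) are illustrative assumptions; only the three symmetries (about the mid-band frequency, about t = T/2, and about the center point) come from the text.

```python
def sub_contours(base, f_c, bandwidth, duration):
    """Return the O, X, Y and Z variants of a contour `base` that rises
    from f_c to f_c + bandwidth over [0, duration]."""
    mirror = 2.0 * f_c + bandwidth   # twice the mid-band frequency f_c + B/2
    return {
        "O": base,                                    # original contour
        "X": lambda t: mirror - base(t),              # reflected about f = f_c + B/2
        "Y": lambda t: base(duration - t),            # reflected about t = T/2
        "Z": lambda t: mirror - base(duration - t),   # reflected about the center point
    }


F_C, B, T = 6000.0, 4000.0, 0.4


def rising(t):
    """Example base contour: power-law upsweep with exponent 2 (an assumption)."""
    return B * (t / T) ** 2 + F_C


variants = sub_contours(rising, F_C, B, T)
endpoints = {name: (f(0.0), f(T)) for name, f in variants.items()}
```

The O and Z variants sweep upward and the X and Y variants sweep downward, so the four together cover concave and convex versions of both sweep directions.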

D. PIECEWISE CONSTRUCTION STRATEGY FOR COMPLEX TONAL SOUNDS
For a tonal sound with a simple contour, a sub-PFMB signal or a sub-SFMB signal is sufficient for accurate mimicry. However, in reality, the contours of most tonal sounds are complex, and as can be observed from Fig. 4, a bionic signal can only match a part of a tonal sound. In this case, a piecewise construction strategy for complex tonal sounds is proposed.
Suppose that the bionic signal corresponding to a tonal sound is expressed as $s_B(t)$. To construct $s_B(t)$, the tonal sound is first divided into $M$ segments. To reduce the complexity of constructing bionic signals, $M$ should be as small as possible, provided that the similarity between the bionic signal contour and the tonal sound contour is ensured. Then, $M$ bionic signal segments are constructed to mimic the $M$ segments of the tonal sound. Finally, by putting these $M$ bionic signal segments together in the time domain, we obtain $s_B(t)$. Fig. 9 shows the piecewise construction strategy for complex tonal sounds. As shown in Fig. 9, the bionic signal $s_B(t)$ consisting of $M$ bionic signal segments is expressed as
$$s_B(t) = A_B(t)\,s_N(t), \tag{23}$$
where $A_B(t)$ is the envelope of $s_B(t)$, and $s_N(t)$ is the normalized bionic signal, i.e., $s_B(t)$ before amplitude modulation, which is expressed as
$$s_N(t) = \sum_{m=1}^{M} s_{B,m}\left(t - T_{D,m}\right), \tag{24}$$
where $s_{B,m}(t)$ is the $m$th bionic signal segment and $T_{D,m}$ is its time delay. In general, the time delay for the first bionic signal segment is $T_{D,1} = 0$.
More specifically, the piecewise construction strategy can be divided into the following four steps:
Step 1: Design the TF expression of each bionic signal segment $s_{B,m}(t)$.
For each bionic signal segment, choosing sub-PFMB signal models or sub-SFMB models is based on the contour slope of each tonal sound segment. If the absolute value of the contour slope changes monotonically, both sub-PFMB signal models and sub-SFMB models are suitable. However, when the absolute value of the contour slope firstly increases and then decreases, only sub-SFMB signal models are appropriate.
Furthermore, the TF expression of each bionic signal segment is designed by changing the parameters of the bionic signal model, including curvature adjustment factor α (or β), slope adjustment factor k (or h), duration T , carrier frequency f C and bandwidth B.
To ensure a high similarity between the bionic signal and the original tonal sound, the smoothness of the contour of the bionic signal should be consistent with that of the tonal sound; if the contour of the original tonal sound changes continuously, the frequency and contour slope should change continuously where two adjacent bionic signal segments are connected. For example, if a bionic signal consists of two bionic signal segments $s_{B,1}(t)$ ($0 \le t \le T_{B,1}$) and $s_{B,2}(t)$ ($0 \le t \le T_{B,2}$), whose TF expressions are $f_{B,1}(t)$ and $f_{B,2}(t)$, respectively, then to ensure a smooth transition between $f_{B,1}(t)$ and $f_{B,2}(t)$, the following two conditions should be satisfied:
$$f_{B,1}(T_{B,1}) = f_{B,2}(0), \qquad f'_{B,1}(T_{B,1}) = f'_{B,2}(0).$$
As shown in Fig. 10, the contour of the sine tonal sound in Fig. 4 can be mimicked by three bionic signal segments $s_{B,1}(t)$, $s_{B,2}(t)$ and $s_{B,3}(t)$. The contours of these three bionic signal segments are expressed as $f_{B,1}(t)$, $f_{B,2}(t)$ and $f_{B,3}(t)$, respectively, and they are constructed based on $f_{SO}(t)$, $f_{SY}(t)$ and $f_{SO}(t)$, respectively.
Step 2: Construct the normalized bionic signal $s_N(t)$. The $m$th bionic signal segment of the bionic signal $s_B(t)$, $s_{B,m}(t)$ ($0 \le t \le T_{B,m}$), is defined through its phase function $\phi_{B,m}(t)$, which follows from (1) with $f_{B,m}(t)$, the TF expression of a sub-PFMB signal or a sub-SFMB signal.
Then, each bionic signal segment is shifted along the time axis: the $m$th bionic signal segment $s_{B,m}(t)$ is shifted by $T_{D,m}$ to give $s_{B,m}(t - T_{D,m})$. Substituting the $M$ shifted bionic signal segments into (24), we obtain the waveform of $s_N(t)$.
It is noteworthy that $\phi_{B,m}$ is the phase compensation component of the $m$th segment, which is added to avoid an abrupt phase change between two adjacent bionic signal segments. Obviously, the phase compensation for the first bionic signal segment $s_{B,1}(t)$ is $\phi_{B,1} = 0$. Therefore, by transforming the TF expressions $f_{B,1}(t)$, $f_{B,2}(t)$ and $f_{B,3}(t)$ in Fig. 10 into waveforms in the time domain, we obtain the normalized bionic signal $s_N(t)$. The waveform and TFS of $s_N(t)$ are shown in Fig. 11(a) and (b), respectively.
Step 3: Construct the envelope $A_B(t)$ of the bionic signal $s_B(t)$.
So far, the signal envelope $A_B(t)$ has not been considered. It can be seen from Fig. 1 that, unlike those of conventional ASD and UAC signal waveforms (such as CW, LFM and HFM), the envelopes of tonal sounds are not rectangular, and they vary irregularly from one tonal sound to another. Therefore, the envelope of each bionic signal $s_B(t)$ should be fitted to that of the corresponding tonal sound. The envelope extraction method used here is based on the one proposed in [16].
Firstly, the STFT of the denoised tonal sound is calculated with an $N$-point Hamming window. The denoised tonal sound is a discrete-time signal $x[n]$, whose envelope and phase are denoted by $a[n]$ and $\phi[n]$, respectively. The discrete STFT of $x[n]$ can be expressed as $X[k, l]$, where $k$ is the block number and $l$ is the frequency bin index, and $X_k[l]$ is the discrete Fourier transform (DFT) of the $k$th block. Secondly, the envelope $a[n]$ is obtained from $X_k[l]$. If $\tau_k$ is the starting point of the $k$th block, then the amplitude of $x[n]$ at $\tau_k$ is $a[\tau_k] = A_k$. The peak value of $X_k[l]$ is expressed as $P_k = \max_l |X_k[l]|$. Since $P_k$ is scaled by the Hamming window and the DFT, an amplitude recovery factor $K$ is introduced to compensate for this scaling. Finally, let $a[\tau_k] = K P_k$, so that $a[\tau_k]$ is restored to the same level as the tonal sound envelope. By using piecewise cubic Hermite interpolation to fill in the remaining points of $a[n]$, the extracted envelope of $x[n]$ is obtained. The extracted envelope $a[n]$ and the waveform of the sine tonal sound are shown in Fig. 12, where it can be seen that $a[n]$ matches the envelope of the tonal sound well.
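The block-wise amplitude estimate in this step can be sketched as follows. The sketch rests on several assumptions that are not taken from the paper: the recovery factor is taken to be K = 2 / Σ w[n], which is the value that undoes the window and DFT scaling for a single real sinusoid; the Hamming window uses the symmetric 0.54/0.46 form; and the final interpolation between block start points is omitted. The function names are hypothetical, and the direct DFT is only suitable for short blocks.

```python
import cmath
import math


def hamming_window(length):
    """Symmetric Hamming window (0.54/0.46 form), assumed here."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def block_amplitudes(x, block_len, hop):
    """Return (tau_k, K * P_k) for every full block of the signal x."""
    window = hamming_window(block_len)
    recovery = 2.0 / sum(window)          # assumed amplitude recovery factor K
    points = []
    for start in range(0, len(x) - block_len + 1, hop):
        block = [x[start + n] * window[n] for n in range(block_len)]
        peak = 0.0
        for l in range(block_len // 2 + 1):   # non-negative frequency bins only
            coeff = sum(block[n] * cmath.exp(-2j * math.pi * l * n / block_len)
                        for n in range(block_len))
            peak = max(peak, abs(coeff))      # P_k = max over l of |X_k[l]|
        points.append((start, recovery * peak))
    return points


# A pure tone of amplitude 0.5 centred on DFT bin 8 of a 64-point block.
tone = [0.5 * math.cos(2.0 * math.pi * 8 * n / 64) for n in range(128)]
points = block_amplitudes(tone, block_len=64, hop=32)
```

For a tone that falls between DFT bins, the peak underestimates the amplitude slightly, so a longer block or zero-padding may be needed in practice.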
Step 4: Substituting the envelope $A_B(t)$ and the normalized bionic signal $s_N(t)$ into (23), we obtain the bionic signal $s_B(t)$. The sine tonal sound and its TFS, together with the bionic signal $s_B(t)$ and its TFS, are shown in Fig. 13. It can be seen that the bionic signal matches the true tonal sound very well in terms of both envelope and contour.
Besides, if a tonal sound has $R$ harmonics, its corresponding bionic signal can be expressed as
$$s_B(t) = \sum_{r=1}^{R} s_r(t),$$
where $s_r(t)$ is the $r$th harmonic of $s_B(t)$, constructed according to the four steps above. Furthermore, the frequency of the $r$th harmonic, $f_r(t)$, is designed to be an integer multiple of the fundamental frequency $f_1(t)$, i.e.,
$$f_r(t) = r f_1(t), \quad r = 1, \dots, R.$$
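Steps 2 and 4 can be sketched together as follows. This is a minimal sketch, not the authors' implementation: it assumes a cosine carrier, integrates each segment contour with the trapezoidal rule, and realizes the phase compensation by carrying the accumulated phase from one segment into the next. The function name `build_bionic_signal` and the two example contours are hypothetical.

```python
import math


def build_bionic_signal(segments, sample_rate, envelope=lambda t: 1.0):
    """Concatenate FM segments with a continuous phase.

    `segments` is a list of (contour, duration) pairs, where each contour
    takes the local time within its own segment. Returns the samples and
    the phase offset applied at the start of each segment."""
    step = 1.0 / sample_rate
    samples, offsets = [], []
    phase = 0.0            # running phase, carried across segment boundaries
    start_index = 0
    for contour, seg_duration in segments:
        offsets.append(phase)                     # compensation for this segment
        count = int(round(seg_duration * sample_rate))
        for i in range(count):
            t_local = i * step
            t_global = (start_index + i) * step
            samples.append(envelope(t_global) * math.cos(phase))
            phase += math.pi * (contour(t_local) + contour(t_local + step)) * step
        start_index += count
    return samples, offsets


def upsweep(t):
    """Linear sweep from 1000 Hz to 1500 Hz over 5 ms (example only)."""
    return 1000.0 + 100000.0 * t


def plateau(t):
    """Constant 1500 Hz, matching the end frequency of the sweep."""
    return 1500.0


signal, offsets = build_bionic_signal([(upsweep, 0.005), (plateau, 0.005)],
                                      sample_rate=8000.0)
```

Because the second segment starts from the phase reached at the end of the first, the waveform has no jump at the joint; passing an extracted envelope as `envelope` applies the amplitude modulation of Step 4.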

A. SYNTHESIS OF SIX CATEGORIES OF TONAL SOUNDS
In this section, we examine the synthesis performance of the proposed method. Bionic signals corresponding to the six categories of tonal sounds (constant frequency, upsweep, downsweep, concave, convex and sine) described in Table 1 are synthesized. Two high-quality, representative tonal sounds of each category are chosen to be matched and mimicked. The true tonal sounds and their TFSs, together with the constructed bionic signal waveforms and their TFSs, are shown in Figs. 14-19. It can be seen that the constructed bionic signals closely match the true tonal sounds in terms of both envelopes and TFSs. The waveforms and TFSs of the constant frequency tonal sound T_cf1 and the corresponding bionic signal B_cf1 are shown in Fig. 14(a)-(d), and those of the constant frequency tonal sound T_cf2 and the corresponding bionic signal B_cf2 are shown in Fig. 14(e)-(h).
The waveforms and TFSs of the upsweep tonal sounds T_up1 and the corresponding bionic signals B_up1 are shown in Fig. 15(a)-(d), and those of the upsweep tonal sounds T_up2 and the corresponding bionic signals B_up2 are shown in Fig. 15(e)-(h).
The waveforms and TFSs of the downsweep tonal sounds T_down1 and the corresponding bionic signals B_down1 are shown in Fig. 16(a)-(d), and those of the downsweep tonal sounds T_down2 and the corresponding bionic signals B_down2 are shown in Fig. 16(e)-(h).
The waveforms and TFSs of the concave tonal sounds T_concave1 and the corresponding bionic signals B_concave1 are shown in Fig. 17(a)-(d), and those of the concave tonal sounds T_concave2 and the corresponding bionic signals B_concave2 are shown in Fig. 17(e)-(h).
For the convex tonal sounds T_convex1 and T_convex2, and their corresponding bionic signals B_convex1 and B_convex2, the results are shown in Fig. 18(a)-(h), while for T_sine1 and T_sine2, and the corresponding bionic signals B_sine1 and B_sine2, they are shown in Fig. 19(a)-(h).

B. SYNTHESIS OF COMPLEX CETACEAN SOUNDS
In the second experiment, we examine the performance of the proposed method for synthesizing complex cetacean sounds.
In addition to tonal sounds, cetaceans can also produce some FM sounds with relatively long duration and complex TF structures, such as signature whistles [36]. Since these complex cetacean sounds can be divided into multiple simple FM signals, the proposed bionic signal models and the piecewise construction strategy can also be utilized to synthesize complex cetacean sounds.
An example is shown in Fig. 20, where 21 bionic signal segments are constructed to achieve high-similarity mimicry. The waveforms and TFSs of the complex cetacean sound C_complex and the corresponding bionic signal B_complex are shown in Fig. 20(a)-(d). It can be seen that the constructed bionic signal matches the complex cetacean sound very well.
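The idea of building a complex contour from simple segments can be sketched as follows. This is only an illustration under stated assumptions: each segment is a linear FM sweep (a stand-in for the two bionic signal models, whose expressions are not repeated here), the segment list and sampling rate are hypothetical, and the phase is accumulated across segment boundaries so that the waveform has no phase jumps.

```python
import math

def piecewise_contour(segments, fs):
    """Concatenate linear-FM segments into one sampled frequency contour.

    segments : list of (f_start_hz, f_end_hz, duration_s) tuples (hypothetical)
    fs       : sampling rate in Hz
    """
    contour = []
    for f_start, f_end, duration in segments:
        n = max(1, round(duration * fs))
        for i in range(n):
            contour.append(f_start + (f_end - f_start) * i / n)
    return contour

def synthesize(contour, fs):
    """Phase-continuous waveform: the phase is the running sum of frequency."""
    phase = 0.0
    out = []
    for f in contour:
        out.append(math.sin(phase))
        phase += 2.0 * math.pi * f / fs
    return out

# Hypothetical three-segment contour: upsweep, downsweep, constant frequency.
fs = 48_000
segments = [(8000.0, 10000.0, 0.010),
            (10000.0, 9000.0, 0.005),
            (9000.0, 9000.0, 0.005)]
contour = piecewise_contour(segments, fs)
wave = synthesize(contour, fs)
```

Choosing each segment's start frequency equal to the previous segment's end frequency keeps the contour continuous at the joints; an example such as Fig. 20 would use many more segments.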

C. CAMOUFLAGE ABILITY EVALUATION
In the third experiment, the camouflage ability of the constructed bionic signals is examined.
Since the contours of tonal sounds have obvious FM characteristics, and existing acoustic classifiers usually classify a tonal sound based on its contour [26]-[29], the camouflage ability of a synthesized bionic signal depends on the similarity between its contour and that of the true tonal sound. The Pearson correlation coefficient (PCC), which is widely used to measure the similarity between two data sets [37], is used to measure the similarity between the contours of the true tonal sounds and the synthesized bionic signals.
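The extracted contour of a true tonal sound is $f_T[n] = \{f_{T,1}, f_{T,2}, \cdots, f_{T,n}\}$, and the contour of the corresponding bionic signal is $f_B[n] = \{f_{B,1}, f_{B,2}, \cdots, f_{B,n}\}$. The PCC between the two contours is
$$r_{TB} = \frac{\sum_{i=1}^{n} (f_{T,i} - \bar{f}_T)(f_{B,i} - \bar{f}_B)}{\sqrt{\sum_{i=1}^{n} (f_{T,i} - \bar{f}_T)^2}\,\sqrt{\sum_{i=1}^{n} (f_{B,i} - \bar{f}_B)^2}}, \tag{35}$$
where $\bar{f}_T$ and $\bar{f}_B$ are the mean values of $f_T[n]$ and $f_B[n]$, respectively. The closer $r_{TB}$ is to 1, the higher the similarity between the two contours, that is, the better the camouflage ability of the bionic signal. Table 2 shows the PCC between 13 true cetacean sounds and their corresponding bionic signals (shown in Figs. 14-20). It can be seen that all 13 PCC results are higher than 0.97, and 11 of them are no less than 0.99, which means that the contours of the 13 true cetacean sounds and their corresponding bionic signals are highly similar. Therefore, the camouflage ability of the synthesized bionic signals is very high.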
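The contour similarity measure can be computed directly from two equal-length contours. The sketch below uses the standard sample definition of the PCC; the function name and the two short example contours are hypothetical and are not taken from Table 2.

```python
import math

def pearson_cc(f_true, f_bionic):
    """Pearson correlation coefficient between two equal-length contours."""
    if len(f_true) != len(f_bionic) or len(f_true) < 2:
        raise ValueError("contours must have the same length (at least 2)")
    n = len(f_true)
    mean_t = sum(f_true) / n
    mean_b = sum(f_bionic) / n
    cov = sum((a - mean_t) * (b - mean_b) for a, b in zip(f_true, f_bionic))
    var_t = sum((a - mean_t) ** 2 for a in f_true)
    var_b = sum((b - mean_b) ** 2 for b in f_bionic)
    if var_t == 0.0 or var_b == 0.0:
        # A perfectly flat contour has zero variance, so the PCC is undefined.
        raise ValueError("PCC is undefined for a constant contour")
    return cov / math.sqrt(var_t * var_b)

# Hypothetical contours in Hz (true tonal sound vs. bionic signal).
contour_true = [6800.0, 7300.0, 7900.0, 8600.0, 9300.0]
contour_bionic = [6800.0, 7350.0, 7950.0, 8550.0, 9300.0]
r_tb = pearson_cc(contour_true, contour_bionic)
```

Because the PCC is invariant to offset and positive scaling, it measures the similarity of contour shape rather than of absolute frequency, so it is usually read together with the TFS plots.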

D. MODULATION OF CETACEAN TONAL SOUNDS
In the fourth experiment, we examine the modification performance of the proposed method.
Here we take the upsweep tonal sound T_up1 (shown in Fig. 15(a)-(b)) as an example and construct its corresponding bionic signal B_up3. While ensuring a good camouflage ability for B_up3, the parameters of B_up3 are modified to improve its ASD and UAC performance. For radar, sonar and acoustic communication signal design, range resolution (RR), range sidelobe level (RSL), velocity resolution (VR), and Doppler tolerance (DT) are four key performance indicators, which can be obtained from the ambiguity function (AF) [38].
For ASD and UAC applications, when $B/f_o < 0.01$, the signal is considered to be narrowband [39], where $B$ is the bandwidth and $f_o$ is the center frequency of the signal. Based on this criterion, it can be seen that the tonal sound T_up1 is wideband. Therefore, the wideband ambiguity function (WAF) [40]-[42] is used to examine the four key indicators above. The WAF can be defined as follows
$$\chi(\tau, \eta) = \sqrt{\eta} \int_{-\infty}^{\infty} s(t)\, s^{*}\big(\eta (t - \tau)\big)\, dt,$$
where $s(t)$ is the signal, $\eta = (c - v)/(c + v) \approx 1 - 2v/c$ is the Doppler scale factor, $\tau = 2R/c$ is the propagation time delay, $R$ is the target range, '$*$' is the complex conjugate operator, and $c$ is the sound speed in water. Besides, for ASD applications, $v$ is the relative speed between the sonar system and the target, and for UAC applications, $v$ is the relative speed between the acoustic signal transmitter and the receiver.
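The narrowband criterion and a numerical WAF evaluation can be sketched as below. Several points are assumptions rather than details from this paper: the center frequency is taken as the band midpoint, $c = 1500$ m/s is a nominal sound speed, the WAF is evaluated in the common textbook form $\sqrt{\eta}\int s(t)\,s^{*}(\eta(t-\tau))\,dt$ (normalization conventions vary between references), and the test signal is a plain linear chirp over the 6.8 kHz to 9.3 kHz band of T_up1 rather than the bionic signal itself.

```python
import cmath
import math

def fractional_bandwidth(f_low, f_high):
    """Return B / f_o, taking f_o as the band midpoint (an assumption)."""
    return (f_high - f_low) / (0.5 * (f_low + f_high))

def doppler_scale(v, c=1500.0):
    """Doppler scale factor eta = (c - v) / (c + v); c is a nominal value."""
    return (c - v) / (c + v)

def sample_at(signal, fs, t):
    """Linearly interpolate the sampled signal at time t (zero outside it)."""
    x = t * fs
    if x < 0.0 or x > len(signal) - 1:
        return 0j
    i = min(int(x), len(signal) - 2)
    frac = x - i
    return (1.0 - frac) * signal[i] + frac * signal[i + 1]

def waf_magnitude(signal, fs, tau, eta):
    """Riemann-sum estimate of |sqrt(eta) * int s(t) s*(eta (t - tau)) dt|."""
    acc = 0j
    for n, s_n in enumerate(signal):
        shifted = sample_at(signal, fs, eta * (n / fs - tau))
        acc += s_n * shifted.conjugate()
    return abs(math.sqrt(eta) * acc / fs)

# Hypothetical test signal: a 20 ms linear chirp from 6.8 kHz to 9.3 kHz.
fs, duration = 48_000, 0.020
f_low, f_high = 6800.0, 9300.0
rate = (f_high - f_low) / duration
chirp = [cmath.exp(2j * math.pi * (f_low * t + 0.5 * rate * t * t))
         for t in (n / fs for n in range(round(fs * duration)))]

ratio = fractional_bandwidth(f_low, f_high)       # compare with 0.01
peak = waf_magnitude(chirp, fs, tau=0.0, eta=1.0)  # value at the origin
off_peak = waf_magnitude(chirp, fs, tau=1e-3, eta=1.0)
eta_10 = doppler_scale(10.0)                       # v = 10 m/s
```

With these assumptions the fractional bandwidth of the T_up1 band is far above 0.01, which is consistent with treating the signal as wideband. Sweeping `tau` and `eta` over a grid would give a surface comparable in kind to a WAF diagram.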
The TF expression $f_{PO}(t)$ shown in (15) is chosen to construct B_up3. Therefore, the contour of B_up3 can be adjusted by modifying five parameters: the carrier frequency $f_C$, bandwidth $B$, duration $T$, curvature adjustment factor $\alpha$ and slope adjustment factor $k$. By modifying these parameters, we found that increasing the value of $B$ can improve the RR, RSL and VR of B_up3, while the parameters $\alpha$ and $k$ are related to the DT of B_up3. The parameter values of the tonal sound T_up1 and the bionic signal B_up3 are given in Table 3, where it can be seen that the frequency range of B_up3 (6.8 kHz to 10.3 kHz) is slightly expanded, by 1 kHz, relative to that of T_up1 (6.8 kHz to 9.3 kHz). Moreover, the duration of B_up3 is 3 ms longer than that of T_up1. The waveform and TFS of B_up3 are shown in Fig. 21(a)-(b). By comparing Fig. 21(a)-(b) with Fig. 15(a)-(b), it can be seen that B_up3 matches T_up1 very well. Based on (35), the PCC between B_up3 and T_up1 is $r_{TB} = 0.9985$, which means that the contours of T_up1 and B_up3 are highly similar. Therefore, the camouflage ability of the synthesized bionic signal B_up3 is very high.
The RR, RSL, VR, and DT of T_up1 and B_up3 are shown in Table 4, where the change of RR is (0.2716 − 0.5100)/0.5100 × 100% ≈ −46.75%. Similarly, the changes of RSL, VR and DT are −40.31%, −61.18% and 1800%, respectively. It can be seen that these four performance indicators of the bionic signal B_up3 are significantly improved compared to those of the tonal sound T_up1. For visualization, the WAF diagrams of the two signals are shown in Fig. 22. It is well known that, for ASD applications, a higher RR and a lower RSL lead to high-accuracy range measurement, a higher VR indicates high-accuracy speed measurement, and a larger DT allows range measurement for high-velocity targets. On the other hand, for UAC applications, RR and RSL correspond to time resolution (TR) and time sidelobe level (TSL); a higher TR and a lower TSL indicate high-accuracy time measurement, while a larger DT allows effective UAC when the acoustic transmitter (or receiver) moves at a high speed. Based on the above analysis and results, it can be seen that by slightly modifying the parameters of the synthesized bionic signals, we can obtain bionic signals with both high camouflage ability and good detection and communication performance.
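The percentage changes quoted above follow the usual relative-change formula, which can be checked with a few lines of Python. The function name is hypothetical, and only the two RR values stated in the text are used, since the remaining entries of Table 4 are not reproduced here.

```python
def relative_change_percent(reference, modified):
    """Relative change of `modified` with respect to `reference`, in percent."""
    return (modified - reference) / reference * 100.0

# RR of T_up1 (reference) and of B_up3 (modified), as stated in the text.
rr_change = relative_change_percent(0.5100, 0.2716)
print(f"RR change: {rr_change:.2f}%")
```

On this scale a change of 1800% means that the modified value is 19 times the reference value, whereas the negative changes of RR, RSL and VR indicate that the corresponding quantities became smaller.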

V. CONCLUSION
In this paper, based on the analysis of the acoustic characteristics of tonal sounds, two bionic signal models and their corresponding sub-models have been developed to match various TF structures of cetacean tonal sounds. Together with the proposed bionic signal models, a piecewise construction strategy was developed to realize high-similarity mimicry of most tonal sounds and some complex cetacean sounds, such as signature whistles. The bionic signal models and their corresponding sub-models have exact and closed-form mathematical expressions, and together with the effective piecewise construction strategy, they provide the following benefits, as demonstrated by extensive design examples:
(1) The synthesized bionic signal waveforms are very close to the true cetacean tonal sounds, so they can be used to construct bionic signal waveforms with high camouflage ability.
(2) Most cetacean tonal sounds and some cetacean sounds with complex TF structures can be imitated with high similarity.
(3) The parameters (time-domain envelope, frequency distribution and TF shape) of the synthesized bionic signals can be conveniently adjusted, which is beneficial in two respects: 1) generating other similar bionic signals, and 2) changing the detection and communication performance of bionic signals by suitably adjusting the parameters of the signal models.
Compared with the conventional methods [12]-[16], [24], [25], the proposed method can not only achieve high camouflage ability, but also obtain high detection and communication performance for covert ASD and UAC through proper parameter adjustments. Moreover, the proposed method can also be employed for the construction of cetacean sound databases and for behavioural research on cetaceans.