Audio Encryption Scheme Using Self-Adaptive Bit Scrambling and Two Multi Chaotic-Based Dynamic DNA Computations

In this paper, a novel scheme for secure audio transmission is introduced. The novelty of this scheme is that it combines four different audio encryption techniques, which makes it more secure: self-adaptive scrambling, multi chaotic maps, dynamic DNA encoding and cipher feedback encryption. It also introduces two newly designed multi chaotic maps as pseudo random generators that combine five different chaotic maps with eight control parameters. The scheme consists of three phases with three secret keys. The first phase is a self-adaptive bit scrambling, where the SHA-512 hash of the input audio is computed and used as the first secret key for cyclically shifting the input audio binary stream, which efficiently reduces the strong correlation between neighboring audio samples. The second phase is a dynamic DNA encoding of the scrambled audio using a second secret key obtained from a pseudo random generator (LCS), a novel multi chaotic map design that combines the Sine, Chebyshev and Logistic maps and has three control parameters. As DNA encoding provides fast computations and large capacity for data transmission, the third phase consists of two DNA algebraic operations, "AND" and "XOR", using a pseudorandom DNA sequence as the third secret key, generated by another newly designed multi chaotic map, HLG, which combines the Henon, Logistic and Gaussian chaotic maps and has five control parameters. The third phase is structured in a cipher feedback mode and so achieves strong diffusion and confusion in the encrypted audio. Combining five different chaotic maps in the proposed scheme also increases the number of secret control parameters, which yields a huge key space and hence robust strength against brute force attacks.
The scheme is evaluated using different measurements including signal to noise ratio (SNR), peak signal to noise ratio (PSNR), number of samples change rate (NSCR), unified average changing intensity (UACI), root mean square (RMS), crest factor (CF), correlation coefficient, histogram, key sensitivity and key space. The results indicate that the scheme is highly secure and stronger than many recent similar audio encryption schemes against different types of attacks.


I. INTRODUCTION
Multimedia data security is a major challenge for modern open communication systems, which are threatened by different types of attacks. The main tool for achieving this security is encryption. Audio files are usually larger than other multimedia files such as images or text. Conventional encryption schemes such as DES [1], AES [2] and RSA [3] are therefore not suitable for audio data encryption, as they require long computational time and high power consumption and so cannot provide real time communication. Introducing new audio encryption schemes with high security and high speed has thus become the main concern of researchers working in the multimedia security area [4]-[9]. In [10], a perception based method for partially encrypting telephonic speech has been introduced. In [11], [12], two selective audio encryption schemes have been proposed. In [13], the authors proposed a progressive audio scrambling with compression algorithm. Most modern encryption schemes are based on two main principles: confusion and diffusion [14]. Confusion means that the relation between the ciphertext and the key is as complex as possible, while diffusion means that the plaintext statistics are completely dissipated in the ciphertext statistics. Chaos theory has been identified as a source of confusion and diffusion in cryptography [15].
The associate editor coordinating the review of this manuscript and approving it for publication was Mamoun Alazab.
A chaotic map has dynamic behavior and is very sensitive to its control parameters and initial conditions: a tiny change in the initial conditions produces a remarkable change in the output, which makes the system more random. Chaotic maps have been used in many image encryption schemes, as in [16]-[23], and the same holds for audio encryption. In [24], a Baker chaotic map based speech segment permutation has been used for audio encryption.
The authors in [25] used 2-D cellular automata for audio scrambling and encryption.
In [26], the authors introduced a chaotic encryption scheme for speech signals in transform domains. A voice encryption scheme based on off-line ICA has been proposed in [27]. An audio shuffling encryption scheme has been introduced in [28]. In [29], compound chaotic mapping has been used for a voice encryption scheme. An audio encryption scheme with single and double dimension discrete time chaotic systems has been proposed in [30]. Compressive sensing and the Arnold transform have been used for audio encryption in [31].
The authors in [32] have proposed random unified chaotic maps as a PRBG and used them for voice encryption in wireless communication. In [33], an audio encryption scheme based on the cosine number transform has been introduced. An audio encryption scheme using confusion and diffusion based on a multi-scroll chaotic system and a one time key has been presented in [34]. In [35], the authors proposed an approach for speech encryption using the Zaslavsky map for pseudo random number generation.
A mixture of chaos functions has been used for audio encryption in [36]. A novel approach based on a stream cipher for selective speech encryption has been introduced in [37]. A chaotic shift keying using multiple permutations has been introduced in [38]. In [39], an audio encryption scheme using delayed uncertainty of hybrid bidirectional associative memory and fuzzy neural networks has been proposed.
In [40], an audio encryption algorithm using a permutation-substitution architecture with a chaotic circle map and modified rotation equations has been proposed. An audio encryption scheme based on the fast Walsh Hadamard Transform and mixed chaotic keystreams has been introduced in [41].
DNA cryptography is a recent security tool which has many advantages [42]. Among these advantages is massive parallelism, which provides faster computations. It also offers large capacity for data storage and transmission by DNA molecules and low power consumption. These properties make DNA cryptography preferable to conventional schemes such as DES and AES. DNA computations were first used for encryption in [43], [44] in 1994. Combining DNA cryptography and chaotic maps produces a very secure cryptosystem, as has been done in many image encryption schemes such as [45]-[47]. For audio encryption, a cryptosystem using chaotic maps and DNA encoding has been proposed in [48]. A novel fast and secure approach for voice encryption based on DNA computing has been introduced in [49].
Recently, in [50], an audio encryption scheme using DNA encoding with the Logistic map and channel shuffling has been introduced. In this scheme, the key generation is completely independent of the input audio blocks, and the encryption of each block is independent of the previously encrypted blocks, which may cause weak confusion and diffusion and hence make the scheme not robust enough against chosen ciphertext and chosen plaintext attacks.
This paper proposes an improved audio encryption scheme that consists of three phases with three secret keys and has the following advantages:
(a) Scrambling is a good tool for hiding the plaintext statistics. In particular, bit scrambling achieves confusion and diffusion that scrambling of audio samples cannot achieve, and it destroys the correlation between adjacent samples. The first phase of the proposed scheme is therefore a scrambling of the input audio binary bit stream, using the SHA-512 hash of the input audio file as the first secret key. With this dependency of the key on the input audio, the scheme is more random and stronger against statistical and differential attacks.
(b) In the second phase, a pseudo random generator (LCS) that combines multiple chaotic maps (the Sine, Chebyshev and Logistic maps) is used to generate a pseudo random sequence as the second secret key for dynamic DNA encoding of the scrambled audio. Using dynamic rather than fixed DNA encoding to change the audio samples and spread the effect of the original input audio into the cipher audio enhances security, as eight encoding rules are used instead of only one. DNA encoding also provides faster computations and larger capacity for information storage and transmission.
(c) A second pseudo random generator, which combines the Henon, Logistic and Gaussian chaotic maps in a new combination, is used to generate a pseudo random binary sequence that is DNA encoded and used as the third secret key with the two DNA algebraic operations "AND" and "XOR" in the third phase of the proposed scheme. This phase is structured in a cipher feedback mode, so that the current cipher audio depends not only on the original input audio and the key but also on the previous cipher audio. Any tiny change in the original input audio therefore spreads quickly into the whole cipher audio, which achieves strong diffusion and confusion in the encrypted audio. The combination of five different chaotic maps in the proposed scheme makes it more random and gives it a larger key space than a simple single chaotic system.
The remainder of the paper is organized as follows: Section II introduces the preliminary studies. Section III introduces the proposed scheme. In Section IV, the simulation results are presented. Section V presents the security analysis. Section VI gives a comparative study of the proposed scheme. Conclusions are presented in Section VII.

A. CHAOTIC MAP
The behavior of chaotic maps changes rapidly with any slight change in the control parameters.
The main idea in using chaotic maps for encryption is to use their control parameters and initial conditions as secret keys. Chaotic maps are subdivided into one-dimensional and high-dimensional chaotic maps. One-dimensional maps, such as the Sine Map (SM), Logistic Map (LM), Chebyshev Map (CM) and Gaussian Map (GM), have few parameters. These maps and the Henon Map (HM) are the components of the two proposed maps, LCS and HLG, which are discussed next.
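As an illustration of the sensitivity described above, the following Python sketch implements the five component maps in their commonly cited textbook forms and iterates the Logistic map from two initial values that differ by 1e-10. These forms, the function names and the parameter names are assumptions made for illustration; the paper's own equations and the LCS and HLG combinations are not reproduced here.

```python
import math


def logistic(x, u):
    """Logistic map (textbook form): x_{n+1} = u * x_n * (1 - x_n)."""
    return u * x * (1.0 - x)


def sine(x, r):
    """Sine map (textbook form): x_{n+1} = r * sin(pi * x_n)."""
    return r * math.sin(math.pi * x)


def chebyshev(x, k):
    """Chebyshev map (textbook form): x_{n+1} = cos(k * arccos(x_n)), x in [-1, 1]."""
    return math.cos(k * math.acos(x))


def gaussian(x, alpha, beta):
    """Gaussian map (textbook form): x_{n+1} = exp(-alpha * x_n^2) + beta."""
    return math.exp(-alpha * x * x) + beta


def henon(x, y, a, b):
    """Henon map (textbook form): (x, y) -> (1 - a * x^2 + y, b * x)."""
    return 1.0 - a * x * x + y, b * x


def orbit(step, x0, n, *params):
    """Iterate a one-dimensional map n times from x0 and return all n + 1 values."""
    xs = [x0]
    for _ in range(n):
        xs.append(step(xs[-1], *params))
    return xs


# Two Logistic orbits whose initial values differ by 1e-10 (u = 4.0 is an assumed value).
a = orbit(logistic, 0.3, 60, 4.0)
b = orbit(logistic, 0.3 + 1e-10, 60, 4.0)
gap = [abs(p - q) for p, q in zip(a, b)]
print("initial gap:", gap[0], "largest gap over 60 steps:", max(gap))
```

The growing gap between the two orbits is what makes the initial conditions and control parameters usable as secret keys: a receiver with a slightly wrong value generates an unrelated sequence.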

1) FIRST PROPOSED MULTI CHAOTIC MAP (LCS)
The first proposed multi chaotic map is LCS, which merges the Sine, Chebyshev and Logistic maps and has three control parameters, as shown in Fig. In the mathematical function of the proposed LCS, r and u are the chaotic system parameters and LCS(0) is the initial value.

2) SECOND PROPOSED MULTI CHAOTIC MAP (HLG)
A second newly designed multi chaotic map, HLG, which combines the Henon, Logistic and Gaussian chaotic maps and is shown in Fig. 2, is used to generate a pseudo random binary sequence that serves as a key in the third phase. In the mathematical function of the proposed HLG, a, b and β are the chaotic system parameters and HLG(0) and HLG(1) are the initial values.

3) STATISTICAL TESTS AND CHAOTIC BEHAVIOR OF THE PROPOSED MULTI CHAOTIC MAPS
The random behavior of the two proposed multi chaotic maps, LCS and HLG, is tested with the NIST test suite, which includes 16 statistical tests. These tests measure the randomness of the generated sequence. Each test yields a probability value (p-value), which is compared with a significance level α = 0.01, the threshold between the rejection and non-rejection regions. If p-value < 0.01, the sequence is rejected as not random, while if p-value > 0.01, the sequence is accepted as random. Binary sequences of $10^6$ bits generated from the two proposed maps, LCS and HLG, are tested by SP800-22 with $10^3$ iterations. Table 1 lists the results. All the obtained p-values are > 0.01 and the generated sequences pass all SP800-22 tests, so the sequences generated by the two proposed multi chaotic maps have good randomness.
The Lyapunov Exponent (LE) is another tool for evaluating the performance of a chaotic map; it measures how sensitive the map is to the initial values. It can be thought of as the average logarithmic rate of separation or convergence of two nearby points of two time series $A_t$ and $B_t$ separated by an initial distance. A stable system has a negative LE, while positive values indicate exponential divergence from the initial value. As the maximum LE increases, the random behavior of the chaotic map increases. The LE of the proposed LCS and of the Logistic, Chebyshev and Sine chaotic maps is shown in Fig. 3, while the LE of the proposed HLG and of the Logistic and Gaussian chaotic maps is shown in Fig. 4. The LE of LCS and HLG is positive for all values of u ∈ (1, 4), while the other maps have negative or smaller positive values and so are not chaotic for some values of the control parameter.
VOLUME 8, 2020
The eight DNA encoding rules are listed in Table 2. The permutations and combinations of bases can be used for storing and calculating information and hence for its encryption. If encoding rule 1 is considered, the XOR, addition and subtraction operations for DNA sequences are as shown in Table 3, Table 4 and Table 5, respectively.
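A minimal Python sketch of DNA encoding and DNA XOR follows. It enumerates the eight assignments of the bases to the bit pairs 00, 01, 10 and 11 that keep complementary bases (A/T and C/G) on complementary bit pairs, and it assumes that rule 1 maps A, C, G, T to 00, 01, 10, 11. The ordering and content of the paper's Table 2 may differ, and all function names are illustrative.

```python
from itertools import permutations

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
BIT_PAIRS = ("00", "01", "10", "11")


def dna_rules():
    """Return every base assignment in which 00/11 and 01/10 hold complementary bases."""
    rules = []
    for p in permutations("ACGT"):
        if COMPLEMENT[p[0]] == p[3] and COMPLEMENT[p[1]] == p[2]:
            rules.append(dict(zip(BIT_PAIRS, p)))
    return rules


# Eight rules in lexicographic order (an assumption, not the paper's Table 2 order).
RULES = dna_rules()


def encode(bits, rule):
    """Encode a bit string of even length into a DNA string, two bits per base."""
    return "".join(rule[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(dna, rule):
    """Decode a DNA string back into a bit string using the same rule."""
    inverse = {base: pair for pair, base in rule.items()}
    return "".join(inverse[base] for base in dna)


def dna_xor(s, u, rule):
    """XOR two DNA strings base by base through their 2-bit codes under `rule`."""
    inverse = {base: pair for pair, base in rule.items()}
    out = []
    for a, b in zip(s, u):
        value = int(inverse[a], 2) ^ int(inverse[b], 2)
        out.append(rule[format(value, "02b")])
    return "".join(out)


print(encode("00011011", RULES[0]))
print(dna_xor("ACGT", "TTTT", RULES[0]))
```

Because XOR is its own inverse, applying `dna_xor` twice with the same key sequence returns the original DNA string, which is the property a decryption step relies on.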

III. PROPOSED SCHEME
The proposed audio encryption scheme consists of three phases, which are discussed in this section. The block diagram of the proposed scheme is shown in Fig. 5.

A. THE FIRST PHASE (SELF-ADAPTIVE BIT SCRAMBLING)
The first phase is a self-adaptive scrambling, as it uses a key which is a function of the input audio itself. Scrambling means changing the bit positions by cyclic right shifting the input audio binary bit stream as follows:
1. The input audio is sampled and the SHA-512 hash H of the sampled audio, with bits $h_i$, is computed (Eqn. (9)).
2. The sampled audio is converted into a binary vector.
3. The sequence X is converted into a DNA sequence $U = \{U_1, U_2, \ldots, U_{N/2}\}$ of the four bases A, G, T and C using one of the eight DNA encoding rules listed in Table 2, depending on the sequence w: the DNA encoding rule selected for encoding $X_i$ is $w_i$, with $1 \le i \le 4$.
4. The sequence $S = \{S_1, S_2, \ldots, S_{N/2}\}$ is algebraically combined with the sequence $U = \{U_1, U_2, \ldots, U_{N/2}\}$ in a cipher feedback structure, where ⊕ and + are the "XOR" and "AND" DNA operations, respectively, and $C = \{C_1, C_2, \ldots, C_{N/2}\}$ with $1 \le i \le N/2$.
5. C is dynamically decoded into a binary sequence using the chaotic sequence w and the encoding rules listed in Table 2, as in Eqn. (11), to produce the encrypted audio.
At the receiver, upon receiving the encrypted audio, decryption proceeds as follows:
1. Firstly, the receiver has to share the three secret keys with the sender to be able to decrypt the encrypted audio. The three secret keys are:
- the hash of the input audio, H.
- the three control parameters of the multi chaotic generator LCS: r, u and LCS(0).
- the five control parameters of the multi chaotic generator HLG: a, b, β, HLG(0) and HLG(1).
2. Secondly, the receiver uses the five control parameters of the multi chaotic generator HLG (a, b, β, HLG(0) and HLG(1)) to generate the same chaotic sequence Z as in Eqn. (7) and converts it to the sequence L as in Eqn. (12). It also uses the three control parameters of the multi chaotic generator LCS (r, u and LCS(0)) to generate the chaotic sequence w as in Eqn. (10).
3. Both the received encrypted audio and the chaotic sequence L are DNA encoded, using one of the eight DNA encoding rules listed in Table 2 and the chaotic sequence w as in Eqn. (11), into the DNA sequences C and U, respectively, which are algebraically combined in a cipher feedback structure to recover the DNA sequence S.
4. Thirdly, the DNA sequence S is DNA decoded into a binary bit stream using the chaotic sequence w. This binary bit stream of length N is then reshaped into a two-dimensional matrix M'(512, P), and the rows of M' are cyclically left shifted using the bits $h_i$ of the hash H. Finally, the shifted binary bit stream is converted back to the original sampled audio.
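The self-adaptive bit scrambling and its inverse at the receiver can be sketched in Python as follows. The sketch reshapes the bit stream into 512 rows (one per SHA-512 bit, row-major) and shifts each row cyclically. The shift amount used here, (i + 1) mod P when $h_i$ = 1 and zero otherwise, is a hypothetical rule chosen only for illustration, since the exact rule is not recoverable from the text above; all function names are likewise assumptions.

```python
import hashlib

ROWS = 512  # one row per bit of the SHA-512 digest


def to_bits(data: bytes) -> list:
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> (7 - k)) & 1 for byte in data for k in range(8)]


def from_bits(bits: list) -> bytes:
    """Pack a list of bits (length a multiple of 8) back into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def hash_bits(data: bytes) -> list:
    """Return the 512 bits h_i of the SHA-512 digest of the input."""
    return to_bits(hashlib.sha512(data).digest())


def shift_amount(i: int, h_i: int, p: int) -> int:
    """Hypothetical shift rule: (i + 1) mod p if the hash bit is 1, else 0."""
    return (h_i * (i + 1)) % p


def scramble(data: bytes, h: list, inverse: bool = False) -> bytes:
    """Cyclically right shift each of the 512 rows; left shift when inverse=True."""
    bits = to_bits(data)
    if not bits or len(bits) % ROWS:
        raise ValueError("bit length must be a non-zero multiple of 512")
    p = len(bits) // ROWS
    out = []
    for i in range(ROWS):
        row = bits[i * p:(i + 1) * p]
        s = shift_amount(i, h[i], p)
        if inverse:
            s = (p - s) % p  # a left shift by s equals a right shift by p - s
        out.extend(row[-s:] + row[:-s] if s else row)
    return from_bits(out)


audio = bytes(range(256))          # stand-in for sampled audio (2048 bits, P = 4)
key = hash_bits(audio)             # first secret key, derived from the audio itself
scrambled = scramble(audio, key)
restored = scramble(scrambled, key, inverse=True)
print(restored == audio)
```

Note that the hash is taken over the plain audio, so the receiver cannot recompute it from the ciphertext; it has to be shared as a key, as stated in step 1 of the decryption.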

IV. SIMULATION RESULTS
The experiments were run on a laptop with an Intel(R) Core(TM) i5-6200U CPU @ 2.30 GHz, 4 GB RAM and Windows 10 (64-bit), using Mathematica version 11. The proposed scheme was applied to audio files of different sizes, and several experiments were done to measure the performance of the proposed algorithm. The results are shown in Fig. 6. The scheme converts the original audio into a nearly noise-like encrypted audio.

V. SECURITY ANALYSIS
A robust multimedia encryption scheme has to be strong enough against all types of attacks [51]-[54]. The security of the proposed scheme is evaluated using common measurements including key space and key sensitivity analysis, histogram, spectrogram, correlation coefficient, NSCR, UACI, MSE, SNR, PSNR, RMS and CF value.

A. KEY SPACE AND SENSITIVITY ANALYSIS
Key space is the set of all the keys that can be used for audio encryption. It is assessed using two tests: the number of keys and the key sensitivity.

1) THE NUMBER OF KEYS
A good audio encryption scheme must have a large key space to be strong enough against brute force attacks.
The proposed scheme has three secret keys, as previously mentioned:
- The SHA-512 hash of the input audio, H.
- The three control parameters of the multi chaotic generator LCS: r, u and LCS(0).
- The five control parameters of the multi chaotic generator HLG: a, b, β, HLG(0) and HLG(1).
If the computation precision is around $2^{52}$ as in [55], then the 512-bit hash and the eight control parameters give a key space of $2^{512} \times (2^{52})^{8} = 2^{928}$. Table 6 lists a comparison with other schemes. The proposed scheme has a much larger key space and hence more strength against brute force attacks.

2) KEY SENSITIVITY
A robust audio encryption scheme should be very sensitive to any tiny change in its keys: a tiny change in a key value has to produce a totally different encrypted audio. In the proposed scheme, the correct keys are used for encrypting the "Audio-1" file in Fig. 7(a) to obtain the encrypted audio shown in Fig. 7(b). Slight changes are then made to the key values, as shown in Table 7, and the resulting encrypted audios are displayed in Fig. 7(c)-(j). The correlation and the number of samples change rate between the encrypted audios in Fig. 7(c)-(j) and the encrypted audio in Fig. 7(b) are listed in Table 7. These values show that there is no statistical similarity between the audio encrypted with the original keys and the audios encrypted with slightly changed keys. This shows that the proposed scheme is very sensitive to the secret keys and so is robust against brute force and statistical attacks.

B. HISTOGRAM ANALYSIS
Histogram analysis is an accurate metric for measuring the quality of an encrypted audio signal. A good encryption scheme has to encrypt the original audio file into random-like noise with a nearly flat distribution of sample values, so that it can resist statistical attacks. The histograms of the original tested audio files are shown in Fig. 6 [(c), (i), (o), (u), (aa), (gg)], while the histograms of the corresponding encrypted audio files are shown in Fig. 6 [(d), (j), (p), (v), (bb), (hh)].

C. SPECTROGRAM ANALYSIS
A spectrogram is a visual representation of how the frequency spectrum of an audio file varies with time. It is obtained with the Fourier transform and is drawn as a two-dimensional diagram of time and frequency: the audio samples are subdivided into chunks, and the Fourier transform is used to compute the magnitude of the frequency spectrum of each chunk.
The spectrograms of the original tested audio files are shown in Fig. 6 [(e), (k), (q), (w), (cc), (ii)], while the spectrograms of the corresponding encrypted audio files are also shown in Fig. 6. This shows that, for the proposed scheme, there is no similarity between the original tested audio files and their encrypted versions.

D. CORRELATION ANALYSIS
Correlation analysis is a statistical metric for measuring the strength of an encryption scheme against various types of statistical attack [56]. Generally, it measures the mutual relationship between similar segments in the original and the encrypted audio files. A robust encryption scheme has to convert the audio file into a random-like noisy signal with a low correlation coefficient. The correlation coefficient can be computed as
$$r_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}, \qquad \operatorname{cov}(x, y) = \frac{1}{N_s} \sum_{i=1}^{N_s} \bigl(x_i - E(x)\bigr)\bigl(y_i - E(y)\bigr),$$
$$E(x) = \frac{1}{N_s} \sum_{i=1}^{N_s} x_i, \qquad \sigma_x = \sqrt{\frac{1}{N_s} \sum_{i=1}^{N_s} \bigl(x_i - E(x)\bigr)^2},$$
where $N_s$ is the total number of samples, $x_i$ and $y_i$ are the sample values of the original and encrypted audio files, $E(x)$ and $E(y)$ are the mean values of the samples, $\sigma_x$ and $\sigma_y$ are the standard deviations of the original and encrypted audio files, and $\operatorname{cov}(x, y)$ is the covariance between both files. Table 8 shows the size, the duration and the correlation coefficient between the original audio files and the corresponding encrypted files. The scatter plot of the original audio file and the corresponding encrypted audio file is shown in Fig. 8.
The results show that the correlation coefficient values are close to zero, which means that there is no similarity between the two files and reflects the randomness of the encrypted audio file. This indicates the high quality of the proposed encryption scheme.
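The correlation coefficient described above can be computed with a few lines of Python. This is a minimal sketch using the population covariance and standard deviations (division by $N_s$); the function name and the toy sample lists are assumptions for illustration.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient cov(x, y) / (sigma_x * sigma_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sigma_x * sigma_y)


original = [1, 2, 3, 4]
print(correlation(original, [2, 4, 6, 8]))    # perfectly correlated samples
print(correlation(original, [1, -1, -1, 1]))  # uncorrelated samples
```

A value near +1 or -1 means that the encrypted samples still follow the original waveform, whereas a value near 0, as reported for the proposed scheme, means that no linear relationship remains.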

E. UACI AND NSCR ANALYSIS
Resistance to differential attacks is a main measure of the strength of an encryption scheme. This resistance is measured through the NSCR (number of samples change rate) and UACI (unified average changing intensity) tests [57].
To apply these tests, two audio segments that differ in only one sample are encrypted with the same key. The two encrypted audio segments are then compared using NSCR and UACI, where $x_i$ and $x'_i$ are the samples at the i-th position of the two encrypted audio segments and $N_s$ is the length of the audio segments. The theoretical value is 99.61% for NSCR and 33.46% for UACI. An audio encryption scheme whose NSCR and UACI values exceed these theoretical values is more robust and more secure. For the proposed scheme, a randomly selected sample is slightly modified to produce a modified audio. The encrypted version of the original audio x and the encrypted version of the modified audio x' are then used to compute NSCR and UACI for different audio files; the results are shown in Table 9. The obtained values of NSCR and UACI are higher than the theoretical values irrespective of the chosen sample position, which indicates that the proposed scheme has good resistance against differential attacks.
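A minimal Python sketch of the two differential measures follows, using their usual definitions: NSCR is the percentage of positions at which the two encrypted segments differ, and UACI is the mean absolute difference normalized by the full sample range. The default `full_scale=255` is an assumption that corresponds to 8-bit sample values, for which random data give expected values of 255/256 ≈ 99.61% and 257/768 ≈ 33.46%; the function names are illustrative.

```python
def nscr(c1, c2):
    """Percentage of sample positions at which the two cipher segments differ."""
    changed = sum(1 for a, b in zip(c1, c2) if a != b)
    return 100.0 * changed / len(c1)


def uaci(c1, c2, full_scale=255):
    """Mean absolute sample difference as a percentage of the full sample range."""
    total = sum(abs(a - b) for a, b in zip(c1, c2))
    return 100.0 * total / (full_scale * len(c1))


cipher_a = [10, 200, 33, 47]
cipher_b = [10, 17, 33, 250]
print(nscr(cipher_a, cipher_b), uaci(cipher_a, cipher_b))
```

For samples with a different bit depth, `full_scale` would have to be set to the corresponding maximum difference, for example 65535 for unsigned 16-bit samples.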

F. SIGNAL TO NOISE RATIO (SNR)
Signal to Noise Ratio (SNR) is used for measuring the signal quality [58], [59]. An SNR greater than 0 dB means that the signal is stronger than the noise. The SNR is calculated from the original and the encrypted audio files, where $x_i$ and $y_i$ are the samples of the original and encrypted audio and $N_s$ is the number of samples. For the proposed scheme, the SNR values of the tested audio files are listed in Table 8. The more negative the SNR is, the more powerful the scheme is. As the table shows, the proposed scheme yields strongly negative SNR values and so it is stronger against attacks.
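As a worked illustration, the Python sketch below uses the usual definition of SNR in this setting: the energy of the original samples divided by the energy of the difference between original and encrypted samples, expressed in dB. This form is an assumption based on common practice rather than a reproduction of the paper's equation, and the function name is illustrative.

```python
import math


def snr_db(x, y):
    """SNR in dB: signal energy of x over the energy of the difference x - y."""
    signal = sum(a * a for a in x)
    noise = sum((a - b) ** 2 for a, b in zip(x, y))
    return 10.0 * math.log10(signal / noise)


original = [1.0, 1.0]
print(snr_db(original, [0.0, 0.0]))    # difference as strong as the signal
print(snr_db(original, [-9.0, -9.0]))  # difference far stronger than the signal
```

The second call gives a negative value because the difference signal carries 100 times the energy of the original, which is the situation a strong encryption is expected to produce.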

G. PEAK SIGNAL TO NOISE RATIO (PSNR)
The mean squared error (MSE) is computed for two streams stored in vectors X and Y. If X represents the original audio file and Y represents its encrypted audio, the PSNR is computed from the MSE and MAX, the maximum value of the stream. The PSNR values for the encrypted tested audio files are computed and listed in Table 8. The values are small. Lower PSNR values are desired for encrypted audio files, as they indicate a high level of noise in the encrypted audio files and so strong resistance against attacks.
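The two quantities can be sketched in Python as follows, assuming the standard definitions: MSE as the mean of the squared sample differences, and PSNR as 10 log10(MAX^2 / MSE) in dB. The function names and the toy values are assumptions for illustration.

```python
import math


def mse(x, y):
    """Mean squared error between two equally long sample streams."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def psnr_db(x, y, max_value):
    """PSNR in dB for a stream whose maximum possible value is max_value."""
    return 10.0 * math.log10(max_value ** 2 / mse(x, y))


print(mse([0, 0], [3, 4]))
print(psnr_db([0, 0], [10, 10], max_value=100))
```

Since the MSE sits in the denominator, a larger difference between the original and encrypted streams lowers the PSNR, which is why small values are the desired outcome here.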

H. ROOT MEAN SQUARE (RMS) & CREST FACTOR (CF) VALUE
The root mean square (RMS) value is determined by the average amplitude level of an audio signal and is equivalent to the standard deviation when the input signal has zero mean. The crest factor (CF) is a waveform parameter defined as the ratio of the peak value to the effective (RMS) value. It is a measure of the extremeness of the peaks in a waveform. A CF of 0 dB, the minimum possible value, indicates no peaks, as in a DC signal; a higher CF means more pronounced peaks. For the proposed scheme, the RMS and CF values of the tested audio files are computed and listed in Table 10.
As shown, the RMS and CF values of the encrypted audio files are close to 0.6 and 4.3, respectively. This indicates that there is no statistical relationship between the original audio files and the corresponding encrypted audio files.
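A minimal Python sketch of the two measures follows, assuming the usual definitions: RMS as the square root of the mean squared sample value, and the crest factor as 20 log10(peak / RMS) in dB. The function names are illustrative.

```python
import math


def rms(x):
    """Root mean square of the samples."""
    return math.sqrt(sum(a * a for a in x) / len(x))


def crest_factor_db(x):
    """Crest factor in dB: peak absolute value over the RMS value."""
    peak = max(abs(a) for a in x)
    return 20.0 * math.log10(peak / rms(x))


print(rms([3, -3, 3, -3]), crest_factor_db([3, -3, 3, -3]))  # square wave
print(crest_factor_db([1, -1, 0, 0]))
```

As a point of reference, samples uniformly distributed over [-1, 1] have an RMS of 1/√3 ≈ 0.577 and a crest factor of 20 log10(√3) ≈ 4.77 dB, which is in the same range as the values quoted above for the encrypted files.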

I. COMPUTATIONAL TIME ANALYSIS
Table 11 lists the computational speed of the proposed scheme for the eight tested audio files. It takes approximately 0.19 to 0.37 s to encrypt 1 KB of data. This time increases as the audio file length increases.
Comparison with Ref. [49] shows that the proposed scheme is faster, and so it can be used for real time communication. Compressing the audio file before encryption would further reduce the running time.

VI. COMPARATIVE STUDY
The capabilities of the proposed scheme are compared with those of other schemes in this section. The most common security parameters, including correlation, key space, SNR, PSNR, UACI, NSCR, RMS and CF, are used for the performance comparison. Table 12 shows this comparison with other schemes. The proposed scheme has the lowest correlation coefficient between the original audio file and its corresponding encrypted one, which reflects the lowest similarity between them. It is also clear that the proposed scheme has the largest key space among the compared schemes, with a value of $2^{928}$, and so has the strongest resistance against brute-force attacks. The proposed scheme has the most negative SNR and the smallest PSNR among the compared schemes. It also has the largest UACI and NSCR values and the smallest CF, and so it is the strongest one against various differential attacks.

VII. CONCLUSION
The proposed scheme is a secure scheme for audio transmission. It consists of three phases with three secret keys and two newly designed multi chaotic pseudo random generators. The first phase is a self-adaptive bit scrambling, where the SHA-512 hash of the input audio is computed and used as the first secret key for cyclically shifting the input audio binary stream, which efficiently reduces the strong correlation between neighboring audio samples. The second phase is a dynamic DNA encoding of the scrambled audio using a second secret key obtained from a pseudo random generator based on a new multi chaotic map design called LCS, which merges the Sine, Chebyshev and Logistic maps and has three control parameters. The third phase consists of two DNA algebraic operations, "AND" and "XOR", using a pseudorandom DNA sequence as the third secret key, generated by another newly designed multi chaotic map, HLG, which combines the Henon, Logistic and Gaussian chaotic maps and has five control parameters. The third phase is structured in a cipher feedback mode and so achieves strong diffusion and confusion in the encrypted audio. The proposed scheme has a huge key space and hence robust strength against brute force attacks. The scheme is evaluated using different measurements including signal to noise ratio (SNR), peak signal to noise ratio (PSNR), correlation coefficient, histogram, key sensitivity, UACI, NSCR, RMS and CF. The results indicate that the proposed scheme is highly secure and stronger than many recent similar audio encryption schemes against different types of attacks.