High Embedding Capacity Data Hiding Technique Based on EMSD and LSB Substitution Algorithms

Data hiding, also called steganography, is a security technique that protects secret data from malicious attackers during transmission. The goals of steganography are good stego-image quality, high embedding capacity, low computational complexity, visual imperceptibility, undetectability, and improved security. In this paper, we propose a new hybrid image steganography technique based on the least significant bit (LSB) substitution and enhanced modified signed digit (EMSD) algorithms. The proposed algorithm uses n adjacent cover image pixels to hide secret data with the EMSD algorithm and the k least significant bits of each pixel for the LSB substitution algorithm. Hence, it has a higher embedding capacity than the EMSD algorithm and the algorithms based on exploiting modification direction (EMD). The stego-image quality is better than 43 dB when the payload is 2.404 bpp. The experimental results show that the algorithm provides high embedding capacity while preserving an acceptable visual stego-image quality, with distortion that is imperceptible to the human eye. In addition, because the hybrid of the EMSD and LSB substitution algorithms scrambles the secret data bits, it is difficult for malicious people to reassemble the data.


I. INTRODUCTION
As a result of the rapid development of information and communication technologies in recent years, people have started to store large amounts of digital data obtained with cameras, cell phones, and computers. These data are also distributed and transferred more efficiently using network, internet, and cloud technologies. Malicious people may obtain and alter these data during communication, so protecting data security is a critical issue. There are generally two techniques for ensuring data security: cryptography and steganography. Cryptography [1] encrypts the secret data so that it is incomprehensible to malicious people. The secret data is scrambled using a secret key, and only people with the secret key can decrypt the original message. In cryptology, there are two categories: symmetric methods, which use a single shared secret key for both encryption and decryption, and asymmetric methods, which use a public key for encryption and a private key for decryption. The Triple Data Encryption Algorithm (3DES), the Advanced Encryption Standard (AES), the Data Encryption Standard (DES), and the alternative encryption method Blowfish use the symmetric method to encrypt secret data. The RSA (Rivest, Shamir, and Adleman) algorithm, which uses an asymmetric encryption method, is one of the most widely used encryption algorithms [2]. If the secret keys are obtained by malicious people, the secret data can be compromised. Steganography is another popular technique that has been used to secure data in recent years. Steganography embeds secret data into text, audio, image, and video files so that malicious people cannot notice the existence of the secret data.
The associate editor coordinating the review of this manuscript and approving it for publication was Noor Zaman.
Data hiding, namely steganography, is an important method that dates back many years. People have used various materials such as words, images, trees, and furs over different time periods in order to protect their secret information from being understood by others. Nowadays, steganography is used in many applications, for instance medical, military, commercial, authentication, and internet of things (IoT) based applications [3]- [7]. In these applications, image and video files are generally preferred as carrier media because of their high embedding capacity [8]- [10]. When the carrier medium used to hide secret data is an image, the technique is called image steganography. Image steganography generally comprises a cover image, a data hiding algorithm, secret data, and a stego image. The cover image is the carrier medium used to hide information, and the stego image is the resulting medium that contains the secret data. The difference between the cover and stego images is so small that the human eye cannot perceive it. In addition, the secret data is generally encrypted with a secret key before the embedding procedure to make the process more secure. After the encryption process, the stego image is obtained by using the data hiding algorithm [8], [10]. A good data hiding algorithm must have high embedding capacity, good visual quality, imperceptibility, and low complexity. However, in most cases, it is hard for a data hiding method to have all of these features at the same time. Therefore, in recent years researchers have worked on data hiding algorithms that require low computation time and provide acceptable visual quality at high embedding capacity.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The least significant bit (LSB) [10]- [14], pixel value difference (PVD) [15]- [19], and exploiting modification direction (EMD) [20]- [24] algorithms are the most commonly used methods of image steganography in the spatial domain. The LSB substitution, PVD, and EMD algorithms have also been used together in the literature, and good visual quality, high embedding capacity, and improved security have been reported [24]- [33]. LSB substitution is a well-known data hiding technique in which the k least significant bits of the cover image pixel values are used for data embedding. For k = 1, a grayscale cover image pixel hides one bit, while a color cover image pixel conceals three bits. In the LSB method, the embedding capacity increases as k increases, but the imperceptibility and the stego image quality decrease. The PVD algorithm, another widely used technique, uses adjacent pixel pairs of the cover image to conceal the secret data. The difference between the two pixels of a pair determines the number of secret data bits to embed. Since the difference between adjacent pixels is usually tiny, three bits of secret data are concealed on average, whereas more bits can be hidden in pixel pairs where the difference is larger. The EMD algorithm [34] embeds secret data expressed in the (2n + 1)-ary notational system into n adjacent pixel values. At most one of the n adjacent pixels is modified to conceal the secret data. Therefore, EMD based algorithms usually have low embedding capacity but very good image quality. Various studies in the literature aim to increase the data capacity of the EMD algorithm [35]- [38].
In the literature, data hiding algorithms for image steganography are evaluated using the embedding capacity or payload, the peak signal-to-noise ratio (PSNR), and the structural similarity index (SSIM) as measurement parameters. A data hiding algorithm performs well when it obtains high or acceptable PSNR and SSIM values at high embedding capacity. It is also desired that the secret data in the stego image is imperceptible and resistant to attacks.
In this article, we present a high embedding capacity data hiding technique based on the enhanced modified signed digit (EMSD) and LSB substitution algorithms. The purpose of this algorithm is to increase the embedding capacity of the EMSD algorithm while keeping the visual quality of the stego image acceptable. As reported in the literature [23], [26], [37], a PSNR value above 40 dB indicates good stego image quality, and a PSNR value above 30 dB is also acceptable. In both cases, the cover and stego images are considered visually indistinguishable.
We also provide more security by using these two algorithms together. In the design of the algorithm, the EMSD algorithm is applied first to obtain new pixel values, and LSB substitution is then applied to these new pixel values. The number of secret data bits in each pixel for LSB substitution is denoted by k, and the number of adjacent pixels used by the EMSD algorithm is denoted by n. The contributions of this paper are the following: (1) a new data hiding algorithm based on the EMSD and LSB substitution algorithms is proposed; (2) the proposed algorithm significantly improves the embedding capacity of the EMSD based data hiding algorithm; (3) in spite of the increased embedding capacity, the stego image quality remains acceptable; (4) a solution for the fall of boundary problem (FOBP) has been developed.
The remainder of this article is organized as follows. Section II presents related work in detail, covering the LSB substitution, EMD, GEMD, SMSD, and EMSD algorithms. The proposed method, a high embedding capacity data hiding technique based on the EMSD and LSB substitution algorithms, is given in Section III. The experimental results and comparisons used to evaluate the proposed algorithm are presented in Section IV. Finally, the conclusion is presented in Section V.

II. RELATED WORK
In this section, the LSB substitution, EMD, GEMD, SMSD, and EMSD algorithms are introduced in detail. The LSB substitution algorithm is easy to use and fast, and it has high embedding capacity and good PSNR values. The EMD and GEMD data hiding algorithms have good stego image quality. The SMSD and EMSD algorithms, which also have good stego image quality, use the three symbols $\bar{1}$ (that is, $-1$), 0, and 1, together with a weight minimization algorithm (WMA), to obtain the minimum weighted MSD representation of an integer. However, the EMD, GEMD, SMSD, and EMSD algorithms have low embedding capacity and require some computation in the embedding and extracting procedures.

A. LSB SUBSTITUTION
The LSB substitution algorithm is the most common algorithm for hiding secret data in the least significant bits of the cover image. It is easy to implement, fast, and has high embedding capacity and good PSNR and SSIM values. As seen in the literature, secret data is usually hidden in the first least significant bits of the cover image pixels, because the presence of hidden information in these bits cannot be detected by the human eye.
Since the LSB substitution algorithm uses the least significant bits in the data hiding stage, the secret data is first converted to the binary number system. Then, the least significant bits of the cover image pixels are replaced with the secret data bit stream. The embedding and extracting processes of the LSB substitution algorithm are as follows.
Embedding Process
Input: W x H cover image ($C_I$), secret data (SD), number of least significant bits (k)
Output: W x H stego image ($S_I$)
Step1: Convert all of the secret data (SD) to be hidden to binary.
Step2: Calculate the stego pixel value ($P_i$) using (1).
Step3: Repeat Step2 until all secret data is embedded into the cover image. Finally, the stego image ($S_I$) is obtained.
Extracting Process
Input: W x H stego image ($S_I$), number of least significant bits (k)
Output: Secret data (SD)
Step1: Compute the value $d_i$ of the k least significant bits of each stego pixel using (2).
Step2: Convert each $d_i$ value into a k-bit binary value.
Step3: Repeat Step1 and Step2 until all secret data is extracted from the stego image.
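The embedding and extracting steps above can be illustrated with a minimal Python sketch. Equations (1) and (2) are not reproduced in this text, so the sketch assumes the common k-LSB formulation (clear the k low bits, then add the k-bit secret value; read them back with a modulo). The function names and the flat pixel list are illustrative choices, not part of the original paper.

```python
def lsb_embed_pixel(pixel, value, k):
    """Replace the k least significant bits of a pixel with `value` (assumed form of (1))."""
    if not 0 <= value < 2 ** k:
        raise ValueError("value must fit in k bits")
    return pixel - (pixel % 2 ** k) + value


def lsb_extract_pixel(pixel, k):
    """Read back the k least significant bits of a pixel (assumed form of (2))."""
    return pixel % 2 ** k


def lsb_embed(cover, bits, k):
    """Embed a bit string into a flat list of pixel values, k bits per pixel."""
    if len(bits) % k != 0:
        raise ValueError("bit string length must be a multiple of k")
    if len(bits) // k > len(cover):
        raise ValueError("secret data does not fit in the cover pixels")
    stego = list(cover)
    for i in range(len(bits) // k):
        chunk = bits[i * k:(i + 1) * k]
        stego[i] = lsb_embed_pixel(cover[i], int(chunk, 2), k)
    return stego


def lsb_extract(stego, n_bits, k):
    """Recover n_bits of secret data from the first n_bits // k stego pixels."""
    chunks = []
    for i in range(n_bits // k):
        chunks.append(format(lsb_extract_pixel(stego[i], k), "0{}b".format(k)))
    return "".join(chunks)
```

With k = 2, each modified pixel changes by at most 3 gray levels, which is why the distortion grows quickly as k increases.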

B. EMD ALGORITHM
The EMD algorithm was proposed by Zhang and Wang in 2006 [34]. Their algorithm hides secret data expressed in the (2n + 1)-ary notational system in n adjacent image pixels.
For fast and effective data embedding, the algorithm modifies at most one pixel in each group of n adjacent pixels, and the value of that pixel is increased or decreased by 1. For instance, with n = 3, a data stream in the 7-ary notational system is hidden in groups of three adjacent pixels. The EMD algorithm has good stego image quality but low embedding capacity. The payload of the EMD algorithm is computed using (3).
The embedding and extracting processes of the EMD algorithm are as follows.
Embedding Process
Input: W x H cover image ($C_I$), number of adjacent pixels (n), secret data (SD)
Output: W x H stego image ($S_I$)
Step1: Take the n adjacent pixels $p_1, p_2, \ldots, p_n$ from the cover image ($C_I$).
Step4: Calculate the difference value (diff) between $g_{EMD}$ and $SD_i$ using (5).
Step6: Repeat Step1 to Step5 until all secret data is embedded into the cover image. Finally, the stego image ($S_I$) is obtained.
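Because several EMD steps and equations (3) to (5) are not reproduced in this text, the following Python sketch falls back on the standard formulation published by Zhang and Wang [34]: the extraction function is the weighted pixel sum modulo 2n + 1, and at most one pixel is changed by 1. The function names are illustrative, and the sketch deliberately ignores pixels that would leave the 0 to 255 range.

```python
import math


def emd_payload(n):
    """Payload of EMD in bits per pixel: one (2n+1)-ary digit per n pixels."""
    return math.log2(2 * n + 1) / n


def emd_extract(pixels):
    """Weighted sum of the pixel group modulo 2n + 1 (weights 1..n)."""
    n = len(pixels)
    return sum(p * (i + 1) for i, p in enumerate(pixels)) % (2 * n + 1)


def emd_embed(pixels, digit):
    """Hide one (2n+1)-ary digit by changing at most one pixel by +1 or -1."""
    n = len(pixels)
    base = 2 * n + 1
    if not 0 <= digit < base:
        raise ValueError("digit must be in the (2n+1)-ary range")
    diff = (digit - emd_extract(pixels)) % base
    stego = list(pixels)
    if diff == 0:
        return stego              # the group already encodes the digit
    if diff <= n:
        stego[diff - 1] += 1      # raising p_diff adds diff to the weighted sum
    else:
        stego[base - diff - 1] -= 1   # lowering p_(2n+1-diff) also adds diff (mod 2n+1)
    return stego
```

For n = 3, every 7-ary digit can be embedded in a pixel triple while touching a single pixel, which matches the good image quality and low capacity described above.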

C. GEMD ALGORITHM
The embedding and extracting processes of the GEMD algorithm are as follows.
Embedding Process
Input: W x H cover image ($C_I$), number of adjacent pixels (n), secret data (SD)
Output: W x H stego image ($S_I$)
Step1: Take the n adjacent pixels $p_1, p_2, \ldots, p_n$ from the cover image ($C_I$).
Step4: Calculate the difference value (diff) between $g_{GEMD}$ and $SD_i$ using (10).
Step5: According to the diff value, compute the stego pixel values $p_1, p_2, \ldots, p_n$.
Step6: Repeat Step1 to Step5 until all secret data is embedded into the cover image. Finally, the stego image ($S_I$) is obtained.
Extracting Process
Input: W x H stego image ($S_I$), number of adjacent pixels (n)
Output: Secret data (SD)
Step1: Take the n adjacent pixels $p_1, p_2, \ldots, p_n$ from the stego image ($S_I$).

D. SMSD ALGORITHM
The sparse modified signed digit (SMSD) algorithm was proposed to increase the payload of the EMD algorithm and to improve the stego image quality [36]. The SMSD algorithm uses the three symbols $\bar{1}$, 0, and 1, together with a weight minimization algorithm. For SMSD, an n-bit representation of the secret data must satisfy two MSD constraints: it contains at most n/2 non-zero digits, and no two non-zero digits are adjacent. The number of integers ($T_n$) that can be represented by an n-bit SMSD is calculated using (12). When n is an even number, the secret data lies between $-V_n$ and $V_n$ = (1010...).
The embedding and extracting processes of the SMSD algorithm are as follows.
Embedding Process
Input: W x H cover image ($C_I$), number of adjacent pixels (n), secret data (SD)
Output: W x H stego image ($S_I$)
Step1: Calculate the $T_n$ value using (12).
Step4: Calculate the difference value (diff) between $g_{SMSD}$ and $SD_i$ using (15).
Step7: Repeat Step2 to Step6 until all secret data is embedded into the cover image. Finally, the stego image ($S_I$) is obtained.

Extracting Process
Input: W x H stego image ($S_I$), number of adjacent pixels (n)
Output: Secret data (SD)
Step1: Compute the $T_n$ value using (12).
Step3: Compute the secret data ($SD_i$) using (17).
Step4: Repeat Step2 and Step3 until all secret data is extracted from the stego image.
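The sparseness constraint of SMSD can be explored with a short Python sketch. It enumerates every n-digit string over the symbols −1, 0, and 1 in which no two non-zero digits are adjacent and collects the integers they represent. This is only a brute-force illustration of the constraint stated in the text; it is not the weight minimization algorithm of [36], and the function names are hypothetical.

```python
from itertools import product


def msd_value(digits):
    """Integer value of a signed-digit string, most significant digit first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


def is_sparse(digits):
    """True when no two adjacent digits are both non-zero."""
    return all(a == 0 or b == 0 for a, b in zip(digits, digits[1:]))


def sparse_msd_values(n):
    """Sorted list of integers representable by an n-digit sparse MSD string."""
    return sorted({msd_value(d)
                   for d in product((-1, 0, 1), repeat=n)
                   if is_sparse(d)})
```

For example, the string (1, 0, −1) is sparse and evaluates to 4 − 1 = 3. Counting the values returned for a given n gives a quick sanity check on how many secret symbols one group of n pixels can carry under this constraint.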

E. EMSD ALGORITHM
The enhanced MSD (EMSD) data hiding algorithm was proposed by Liu et al. in 2019 [38] to increase the payload of the SMSD algorithm. The EMSD algorithm uses a modified weight minimization algorithm (MWMA) to obtain the EMSD binary representation of the secret data. Unlike in SMSD, adjacent non-zero digits do not cause problems in the MWMA. The number of integers ($M_n$) that can be represented by an n-bit EMSD is calculated using (18). When n is an even number, the secret data lies between $-U_n$ = ($\bar{1}0\bar{1}0\ldots\bar{1}\bar{1}00$) and $U_n$ = (1010...1100); otherwise, it lies between ($\bar{1}\bar{1}0\bar{1}0\ldots\bar{1}\bar{1}00$) and (11010...1100). The payload of the EMSD algorithm is computed using (19).
The embedding and extracting processes of the EMSD algorithm are as follows.
Embedding Process
Input: W x H cover image ($C_I$), number of adjacent pixels (n), secret data (SD)
Output: W x H stego image ($S_I$)
Step1: Calculate the $M_n$ value using (18).
Step4: Calculate the difference value (diff) between $g_{EMSD}$ and $SD_i$ using (21).
Step7: Repeat Step2 to Step6 until all secret data is embedded into the cover image. Finally, the stego image ($S_I$) is obtained.
Extracting Process
Input: W x H stego image ($S_I$), number of adjacent pixels (n)
Output: Secret data (SD)
Step1: Calculate the $M_n$ value using (18).

III. PROPOSED ALGORITHM
In this section, a new hybrid high capacity data hiding algorithm based on LSB substitution and EMSD is proposed. The principal aim of the proposed algorithm is to substantially increase the embedding capacity while providing acceptable visual quality. In the proposed algorithm, the bits of secret data are hidden using a group of n adjacent pixels obtained from the cover image and the k least significant bits of each pixel. First, the temporary pixel values for the n adjacent pixels are calculated using the k value. If the stego pixel values are likely to cause the FOBP (that is, $p_k < 0$ or $p_k > 255$), the temporary pixel values are updated so that the FOBP does not occur. Then, data hiding is performed on these temporary pixel values using the EMSD algorithm. Finally, k bits of secret data are hidden in each pixel of the n adjacent pixel group by the LSB substitution algorithm. Since both the EMSD and LSB algorithms are used, the payload of the proposed algorithm is calculated using (24).
The embedding and extracting processes of the proposed algorithm are as follows.
Embedding Process
Input: W x H cover image ($C_I$), number of adjacent pixels (n), number of least significant bits (k), secret data (SD)
Output: W x H stego image ($S_I$)
Step1: Calculate the $M_n$ value using (18).
Step6: Calculate the difference value (diff) between $g_{EMSD}^{temp}$ and $SD_i$ using (28).
Step7: Convert the diff value to the binary system and apply the modified weight minimization algorithm [38] of the EMSD algorithm, which represents diff as n-bit binary data ($s_n, s_{n-1}, \ldots$).
Step9: Take n × k bits of data (b) from the secret data (SD).
Step10: Use (29) to hide k bits of secret data in each pixel with the LSB substitution algorithm.
Step11: Repeat Step2 to Step10 until all secret data is embedded into the cover image. Finally, the stego image ($S_I$) is obtained. Figure 1 shows the flowchart of the embedding process of the proposed algorithm.
Extracting Process
Input: W x H stego image ($S_I$), number of adjacent pixels (n), number of least significant bits (k)
Output: Secret data (SD)
Step1: Calculate the $M_n$ value using (18).

IV. EXPERIMENTAL RESULTS AND COMPARISONS
In this section, we present comparisons and experimental results to evaluate the proposed algorithm. All experiments were performed in Matlab R2015b on a desktop computer with an Intel(R) Core(TM) i5-7400 CPU @ 3.0 GHz, 4 GB RAM, and the Windows 10 Professional 64-bit operating system. The proposed algorithm was tested on a series of standard grayscale cover images and evaluated with data hiding metrics: embedding capacity (EC), payload (P), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM). Figure 3 shows the eight 512 × 512 grayscale cover images used to hide secret data at various embedding capacities in the experiments. The secret data hidden in these cover images was generated randomly by the computer.
The PSNR value, one of the most commonly used evaluation metrics in image steganography studies, is used to evaluate the stego-image quality. A higher PSNR value indicates better stego image quality. Generally, when the PSNR value of the stego image is higher than 30 dB, the cover and stego images are considered visually indistinguishable. The formulas for the mean square error (MSE) and the PSNR are given in (34) and (35), respectively. In (34), M and N represent the dimensions of the images [10], [38].
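The MSE and PSNR computation can be sketched in a few lines of Python. Equations (34) and (35) are not reproduced in this text, so the sketch assumes the usual definitions for 8-bit images: the MSE is the mean of the squared pixel differences over the M × N image, and PSNR = 10 log10(255² / MSE). Images are represented as lists of rows, and the function names are illustrative.

```python
import math


def mse(cover, stego):
    """Mean square error between two equally sized grayscale images."""
    total = 0.0
    count = 0
    for cover_row, stego_row in zip(cover, stego):
        for c, s in zip(cover_row, stego_row):
            total += (c - s) ** 2
            count += 1
    return total / count


def psnr(cover, stego, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)
```

Under these definitions, an MSE of 1 corresponds to roughly 48.13 dB, and the 30 dB acceptability threshold corresponds to an MSE of about 65.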
The SSIM value is another metric that measures the similarity between the cover and stego images. SSIM values lie between 0 and 1, and a value close to 1 indicates that the original and resulting images are similar. The SSIM value is computed as presented in (36) [10].
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \qquad (36)$$
The EC is defined as the total number of secret bits hidden in the cover image, while the P value in (37), expressed in bits per byte (bpb), is the number of secret bits embedded per grayscale image pixel.
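As an illustration of the SSIM expression in (36), the following Python sketch evaluates it once over the whole image. This is a simplifying assumption: practical SSIM implementations usually average the expression over local windows, so the values will differ from those of a windowed implementation. The constants c1 = (0.01 · 255)² and c2 = (0.03 · 255)² are the commonly used defaults and are also assumed here, since the text does not state them.

```python
def ssim_global(x, y, peak=255.0):
    """SSIM of (36) computed from global image statistics (single window)."""
    xs = [float(v) for row in x for v in row]
    ys = [float(v) for row in y for v in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # Population variances and covariance of the two images.
    var_x = sum((a - mu_x) * (a - mu_x) for a in xs) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in ys) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    c1 = (0.01 * peak) ** 2   # assumed stabilizing constant
    c2 = (0.03 * peak) ** 2   # assumed stabilizing constant
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Identical images give a value of 1, and any change in the mean or the structure of the stego image pulls the value below 1.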
In the experimental studies, the PSNR, SSIM, EC, and P results are analyzed for different k and n values. The performance of the proposed algorithm is evaluated at the maximum embedding capacities for k = 1, 2, 3 and n = 2, 3, 4, 5, 6, 7. If k is greater than 3, the PSNR value of the stego image falls below 30 dB.
To assess the performance of our algorithm on different cover images, Tables 1 to 3 report the PSNR and SSIM values for k = 1, 2, and 3, respectively, with n = 2 to 7. Even when a large number of secret bits is hidden in the cover image, the stego-image quality remains acceptable. Table 1 shows that the PSNR value of the proposed algorithm is about 43 dB when k = 1 and n is between 2 and 7, which indicates that the stego-image quality is good. In addition, the mean SSIM value is 0.985, which means that the cover and stego images are highly similar.
According to Table 2, the mean PSNR and SSIM values are 36.88 dB and 0.984, respectively, when k = 2 and n is between 2 and 7. The stego image therefore has little distortion, which the human eye cannot perceive, and the two images are visually indistinguishable.
In Table 3, the PSNR value of the proposed algorithm is higher than 30 dB when k = 3 and n is between 2 and 7, which shows that the stego image quality is acceptable. Table 3 also shows that n = 2 gives both a higher embedding capacity and a higher PSNR value (31.20 dB) than the other n values. The average SSIM value over all cover images and n values is about 0.82, which indicates that the original and resulting images remain similar.
Table 4 compares the payload of different data hiding algorithms. The proposed algorithm can be used to hide data securely at high capacity in cover images. Table 4 shows that it has a higher capacity than EMD [34], GEMD [35], SMSD [36], enhanced GEMD [37], and EMSD [38]. In the CRT-EMD study [23], the n values other than n = 2 and 3 are not suitable because of the poor stego image quality and low PSNR value (< 30 dB). The proposed algorithm offers a high payload of 2.107 to 2.404 bpb for k = 1, 3.107 to 3.404 bpb for k = 2, and 4.107 to 4.404 bpb for k = 3. These results show that the proposed algorithm should be used with k = 1 if better stego image quality is needed and with k = 2 or k = 3 if higher embedding capacity is required.
Table 5 compares the embedding capacity of the proposed algorithm with that of different data hiding algorithms. In Table 5, the embedding capacity (EC) is the total number of secret bits hidden in a 512 x 512 grayscale cover image. The embedding capacity of the CRT-EMD algorithm is higher than that of the proposed algorithm for k = 1, but the proposed algorithm has a higher embedding capacity than all of the other algorithms when k = 2 and k = 3. It has more than twice the embedding capacity of the EMD, GEMD, SMSD, and EMSD algorithms. In particular, the embedding capacity of the proposed algorithm is about 262,144 to 786,432 bits larger than that of the EMSD algorithm, on which the proposed algorithm is based.
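The payload and capacity figures quoted for Tables 4 and 5 can be cross-checked with simple arithmetic. The Python sketch below assumes that the embedding capacity equals the payload multiplied by the number of pixels, which follows from the definition of P as secret bits per pixel, and that the LSB stage contributes k extra bits per pixel on top of EMSD, which is inferred from the reported payload ranges rather than taken from (24). The function names are illustrative.

```python
def embedding_capacity(payload, width=512, height=512):
    """Total hidden bits for a payload given in bits per pixel."""
    return payload * width * height


def lsb_extra_capacity(k, width=512, height=512):
    """Bits contributed by the k-bit LSB stage alone (k bits per pixel)."""
    return k * width * height


# Highest reported payloads for k = 1, 2, 3 on a 512 x 512 cover image.
for k, payload in ((1, 2.404), (2, 3.404), (3, 4.404)):
    total_bits = round(embedding_capacity(payload))
    print(k, total_bits, lsb_extra_capacity(k))
```

For k = 1 the total is about 630,000 bits, and the LSB stage alone accounts for 262,144 bits at k = 1 and 786,432 bits at k = 3, which is consistent with the stated gain over the EMSD algorithm.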
Figure 4 presents the relationship between the number of pixels and the embedding capacity for different algorithms, including EMD [34], GEMD [35], SMSD [36], enhanced GEMD [37], CRT-EMD [23], EMSD [38], and the proposed algorithm. For all algorithms except CRT-EMD, the maximum embedding capacity decreases as the number of pixels (n) increases. In Figure 4, the maximum embedding capacity is given in Kbit. The algorithms used for comparison have embedding capacities between 150 Kbit and 500 Kbit, whereas the maximum embedding capacity of the proposed algorithm varies between 600 Kbit and 1200 Kbit.
Figure 5 presents the relationship between payload and PSNR for different k values of the proposed algorithm. In this study, payloads between 0.3 and 4.5 bpb are embedded in the cover images and the PSNR values are obtained. All cover images used in the experiments give similar PSNR values at the same payload, so only the graphs for the Lena and Baboon cover images are presented in Figure 5. The graph shows that the PSNR values remain acceptable even as the payload increases.
Table 6 compares the embedding capacity and PSNR values of different data hiding algorithms with those of the proposed algorithm. The SMSD and EMSD algorithms have higher PSNR values than the proposed algorithm, but they have a much lower embedding capacity. Although approximately twice as much secret data is embedded compared with the SMSD and EMSD algorithms, the proposed algorithm achieves a PSNR value above 43 dB. In addition, the proposed algorithm has both a higher data hiding capacity and higher PSNR values than the CRT-EMD algorithm.

V. CONCLUSION
High embedding capacity and acceptable visual image quality are the most basic requirements of image steganography. In this article, a new hybrid data hiding technique based on the least significant bit (LSB) substitution and enhanced modified signed digit (EMSD) algorithms is proposed to embed secret data. The purpose of our algorithm is to achieve high embedding capacity while maintaining acceptable visual stego-image quality. The experimental results show PSNR values above 43 dB when embedding approximately 630 Kbit of secret data, above 37 dB when embedding 900 Kbit, and above 31 dB when hiding 1150 Kbit. In particular, the embedding capacity of the proposed algorithm is about 262,144 to 786,432 bits larger than that of similar algorithms such as EMSD. In addition, the hybrid use of the EMSD and LSB substitution algorithms provides more security for the secret data while retaining suitable stego-image quality.