New High Capacity Reversible Data Hiding Using the Second-Order Difference Shifting

Reversible data hiding (RDH) algorithms have developed rapidly over many years. A histogram shifting (HS) based RDH algorithm usually has two main steps. First, a steep difference histogram is generated with an effective differencing method. Second, by expanding and shifting some differences in the histogram, bits of information are embedded into the cover image, which can later be restored exactly. In this paper, we propose the second-order difference to obtain a steeper difference histogram. We slide a window of size 2 × 2 through the image. For each block, we obtain two first-order differences by calculating the absolute value of the difference within each of its two columns. The second-order difference of the block is then the absolute value of the difference of the two first-order differences. By expanding and shifting the second-order difference, a bit may be embedded into the block. Experiments show that the proposed algorithm outperforms previous state-of-the-art RDH methods in terms of computational complexity, image distortion and embedding performance.


I. INTRODUCTION
Information security is becoming more and more important with the rapid development of information technology, so data hiding (DH) technology is becoming the best method to protect information transmitted through public media [1], [2]. In DH, a secret message is embedded into a cover digital medium such as an image, video or text to produce a corresponding marked medium. Usually, an image is taken as the cover medium. DH schemes are classified into two main types: irreversible data hiding (IDH) schemes [3], [4] and RDH schemes [5], [6]. For some DH applications, such as communication, medical [7] or military applications, the original image must be recovered without any distortion. Because the recipient cannot completely recover the cover image with IDH schemes, RDH methods have developed rapidly. In general, the existing RDH algorithms can be classified into three groups: difference expansion (DE), HS and encrypted RDH methods.
The associate editor coordinating the review of this manuscript and approving it for publication was Jiachen Yang .
The DE method was first proposed by Tian [8], where the difference between two adjacent pixels is expanded to embed a secret message. However, it suffers from undesirable distortion because a location map must be embedded. Alattar then extended it to color image data hiding [9], where the differences of four pixels are employed to embed 3 secret message bits, which improved the embedding capacity (EC). To improve the distortion performance at low embedding capacities, Thodi et al. [10] embedded data with the prediction error expansion (PEE) method instead of the DE method, which makes use of the correlations among adjacent pixels. Later, Zhang et al. [11] computed pairs of differences to generate a two-dimensional difference histogram, and designed a specific difference-pair mapping (DPM) to hide data. In 2013, Ou et al. extended PEE to two-dimensional prediction-error expansion [12]. To obtain a sharp PE histogram and enhance the embedding performance, Hu et al. [13] optimized the histogram modification scheme and proposed a minimum rate criterion RDH algorithm. To make better use of the difference pairs with high frequencies, Xue et al. [14] proposed a DPM RDH method, which employs a fine adjusting strategy for the optimization of embedding. In [15], Wang et al. introduced a DE scheme to resist unintentional attacks. To enhance the quality of the marked image, Maniriho et al. [16] reduced DE to introduce a pixel-block based RDH method. Xiao et al. [17] optimized the path of expansion bin selection according to a specific division of the two-dimensional prediction error histogram; with these optimal expansion bins, they adaptively determine the histogram modification mapping. Recently, various DE-based RDH algorithms have been developed [18]-[21].
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Ni et al. [22] proposed the first HS-based RDH algorithm, which employed the high-frequency bins of the image histogram for embedding data. Nevertheless, it has low EC because it relies on the histogram of the cover image. To make full use of the correlations among adjacent pixels, Hong et al. [23] improved the prediction accuracy with a set of basic pixels for embedding payload. Then, by changing the order of the Markov model, Wang et al. [24] designed an HS scheme to achieve an efficient tradeoff between the quality of the marked image and EC. In [25], Li et al. designed an HS technique that constructs special shifting and embedding functions, so that some conventional HS algorithms can be taken as specific cases. Unlike the previous methods, Zhang et al. [26] collected the pixels with a given complexity to generate a prediction error histogram (PEH), and adaptively selected the expansion bins to minimize the embedding distortion. In [27], Kim et al. employed median edge detection to generate a predicted image of the cover image, so that the secret data can be embedded into the peak points. Later, Wang et al. [28] introduced a rate and distortion optimization model and employed a genetic algorithm to search for nearly optimal zero and peak bins, improving both EC and the quality of the marked image.
To increase the quality of the marked image, Kumar et al. [29] constructed segments of 2, 3, 4 and 5 elements based on the human visual system and then embedded secret data in them. Wang et al. [30] used the fuzzy C-means (FCM) clustering method to classify the cover carriers into different clusters for building multiple histograms. Recently, other HS-based RDH algorithms have been developed [31]-[33].
The encrypted-image based RDH scheme, which embeds the secret data into an encrypted image, was first proposed by Zhang et al. [34]. Then, using a public key cryptosystem, Chen et al. [35] presented an encrypted signal-based RDH scheme based on Paillier homomorphic encryption, which improved the payload and signal quality. In [36], the secret data were embedded into an image encrypted with additive modulo 256, and the secret data were extracted and the cover image was restored with a preserved mean value. Later, Zhou et al. [37] embedded secret data into an encrypted image with a public key modulation mechanism. Using compressive sensing and the discrete Fourier transform, Xin et al. [38] embedded secret data into an encrypted image; their scheme employs both real and imaginary coefficients to recover the original image and provides a flexible payload. In recent years, many other RDH algorithms and effective encrypted RDH algorithms have been developed [39]-[41].
In the encrypted technique, the encrypted image is first obtained from the original image with a special encryption method, and is then taken as a cover image into which secret data are embedded by a DE or HS embedding method. The HS-based technique selects pairs of peak and zero bins from the histogram, expands the peak bins for embedding and shifts the other bins. The DE-based technique generally selects the peak and zero bins at the center and tail of the histogram for embedding, so it can be taken as a special HS-based technique. The performance of an HS-based RDH algorithm relies heavily on the steepness of the histogram and on the method used to decide the pairs of peak and zero bins. To obtain a steeper histogram, a novel HS-based RDH method based on the second-order difference expansion (SODE) is proposed.
The rest of this paper is organized as follows. The related works are described in Section II. Section III introduces the fundamentals of our new scheme. Section IV discusses and analyzes the parameters of the scheme. The comparative experiments and analysis are carried out in Section V, and Section VI summarizes the conclusions.

II. RELATED WORKS
Two DE-based RDH algorithms will be discussed in this section.
A. OU et al.'s PAIRWISE ALGORITHM [12]
In a cover image, the prediction error of each pixel $x_i$ is calculated by
$$e_i = x_i - \hat{x}_i \tag{1}$$
where $\hat{x}_i$ is the prediction of $x_i$ calculated with a particular strategy. The PEH can then be generated with
$$h(k) = \#\{i : e_i = k\} \tag{2}$$
where # is the frequency of prediction errors. The prediction error $e_i$ can then be expanded to embed one secret bit b, or left unexpanded, by a rule in which T is an integer parameter that controls the embedding capacity. With the prediction error sequence ($e_1$, $e_2$, . . .

III. THE PROPOSED METHOD
To take advantage of the correlations within prediction errors, the 2D-PEH methods in the related works expand or shift bins in two directions, which informed our new algorithm. However, they do not change the PEH itself, so their embedding performance is not significantly improved. Based on the idea of pairwise prediction error methods, we use the differences of two pixel pairs to form a second-order difference, which yields a more steeply distributed histogram and hence better embedding performance.

A. SECOND-ORDER DIFFERENCE
For a cover image I of size w × h, where w and h are the width and height of I respectively, a sliding window $sw_{i,j}$ of size 2 × 2 slides over the image in raster scanning order, as shown in Fig. 3. When it reaches the pixel $p_{i,j}$, the pixel at location (i, j) of I, the window covers the four pixels ($p_{i,j}$, $p_{i,j+1}$, $p_{i+1,j}$, $p_{i+1,j+1}$). The first-order differences ($e1_{i,j}$, $e2_{i,j}$) are obtained by
$$e1_{i,j} = \mathrm{abs}(p_{i,j} - p_{i+1,j}), \qquad e2_{i,j} = \mathrm{abs}(p_{i,j+1} - p_{i+1,j+1}) \tag{7}$$
where the function abs(x) returns the absolute value of x. The second-order difference is then
$$d_{i,j} = \mathrm{abs}(e1_{i,j} - e2_{i,j}) \tag{8}$$
Let k = i * (h − 1) + j; then we get the second-order difference sequence ($d_{1,1}$, $d_{1,2}$, . . . , $d_{h-1,w-1}$), which is used to generate the second-order difference histogram (SODH) for embedding. The occurrences of each second-order difference are counted, and the corresponding SODH is defined as
$$hs(k) = \#\{(i, j) : d_{i,j} = k\} \tag{9}$$
where # is the frequency of the value k in the second-order difference sequence.
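The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column pairing follows the description in the abstract, the image is a plain list of rows, and the names `second_order_differences` and `sodh` are hypothetical helpers rather than anything defined in the paper.

```python
from collections import Counter


def second_order_differences(img):
    """Return the (h-1) x (w-1) grid of second-order differences.

    img is a list of rows of integer gray levels. The pairing of pixels
    by column is an assumption taken from the abstract's description.
    """
    h, w = len(img), len(img[0])
    grid = []
    for i in range(h - 1):
        row = []
        for j in range(w - 1):
            e1 = abs(img[i][j] - img[i + 1][j])          # left column of the window
            e2 = abs(img[i][j + 1] - img[i + 1][j + 1])  # right column of the window
            row.append(abs(e1 - e2))                     # second-order difference
        grid.append(row)
    return grid


def sodh(grid):
    """Count how often each second-order difference value occurs."""
    return Counter(v for row in grid for v in row)


# Toy 3 x 3 block (values are made up for illustration).
block = [
    [100, 102, 101],
    [103, 104, 100],
    [103, 110, 108],
]
d = second_order_differences(block)
hist = sodh(d)
```

For this toy block the four overlapping windows give the differences 1, 1, 6 and 2, so the histogram already concentrates on small values even though the raw pixel differences reach 8.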
There is a close relationship between the distribution of the difference histogram and the embedding performance: the steeper the SODH, the smaller the image distortion, and hence the better the embedding performance of the algorithm. The steepness of the histogram of a cover image can be expressed by the standard deviation. According to Eq. (1), the steepness of the related works can be calculated as the standard deviation of the first-order differences, where $\bar{e1}$ and $\bar{e2}$ are the means of the first-order difference sequences ($e1_{1,1}$, $e1_{1,2}$, . . . , $e1_{h-1,w-1}$) and ($e2_{1,1}$, $e2_{1,2}$, . . . , $e2_{h-1,w-1}$) respectively. Based on Eq. (8), the steepness of the proposed method can be calculated as the standard deviation of the second-order differences, where $\bar{d}$ is the mean of the second-order difference sequence.
The second-order difference makes use of the correlation among 4 adjacent pixels, while the first-order difference only makes use of the correlation between 2 adjacent pixels, so the correlation captured by the second-order difference is higher than that captured by the first-order difference. To further improve the correlation, some works make use of the rhombus prediction error, but the pixel values are changed considerably when the bins are expanded, which increases the distortion of the cover image. For example, for the standard gray-scale image Lena of size 512 × 512, the standard deviations of the first-order adjacent pixel prediction error, the rhombic prediction error and the second-order difference are 71.9, 26.0 and 39.2 respectively. The PEHs and SODH of Lena are shown in Fig. 4. The standard deviation of the second-order difference is greater than that of the rhombic prediction error, and its histogram is close to that of the rhombic prediction error. However, the rhombic prediction error can only be expanded in one direction, which changes the pixels considerably, while the second-order difference can use the bidirectional expansion embedding method described in the following subsection, so it results in low distortion.
FIGURE 5. The second-order difference expansion. To turn the second-order difference into an even number for embedding secret bits, the pixel pair with the larger first-order difference was expanded by half of the second-order difference in both directions.

B. BIDIRECTIONAL DIFFERENTIAL EXPANSION AND EMBEDDING
Based on the SODH, we select some of the highest bins for expansion embedding while shifting the other bins to create vacancies. For each second-order difference $d_{i,j}$, the marked second-order difference $d'_{i,j}$ is calculated by Eq. (14), where b ∈ {0, 1} is the secret bit to be embedded and T is a non-negative integer parameter used to control the embedding capacity. To further reduce the distortion in the embedding process, we divide the expanded difference into halves and assign them to the two pixels of the pair with the larger first-order difference, as shown in Fig. 5. The four marked pixels of the two pixel pairs used to calculate the second-order difference $d_{i,j}$ are then obtained with Eq. (15). With this embedding method, some marked pixels may fall outside the range [0, 255] of gray-scale pixels; that is, they encounter the overflow/underflow problem. For brevity, the overflow/underflow problem is handled with the same method as in [12].
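The expand-or-shift step on the second-order differences can be illustrated as follows. This is a sketch only: it assumes the usual histogram-shifting rule in which bins 0..T are expanded to 2d + b and larger bins are shifted by T + 1, a form that agrees with the extraction range [0, 2T + 1] used in Section III-C but is not quoted from Eq. (14). It works on difference values only and does not model the pixel update of Eq. (15) or the overflow handling. The function names are hypothetical.

```python
def embed_difference(d, bit, T):
    """Map one second-order difference to its marked value.

    Assumed rule (not quoted from the paper): differences in [0, T] are
    expanded to 2*d + bit; larger differences are shifted by T + 1.
    Returns (marked_difference, bit_was_embedded).
    """
    if d < 0 or T < 0 or bit not in (0, 1):
        raise ValueError("d and T must be non-negative, bit must be 0 or 1")
    if d <= T:
        return 2 * d + bit, True
    return d + T + 1, False


def embed_sequence(diffs, bits, T):
    """Embed bits into a list of differences; pad with 0 if bits run out."""
    marked, used = [], 0
    for d in diffs:
        bit = bits[used] if used < len(bits) else 0
        m, consumed = embed_difference(d, bit, T)
        if consumed:
            used += 1
        marked.append(m)
    return marked, used


# Differences 1, 1, 6, 2 with T = 2: three bins are expandable, one is shifted.
marked, used = embed_sequence([1, 1, 6, 2], [1, 0, 1], T=2)
```

Under this assumed rule, expanded values always land in [0, 2T + 1] and shifted values always land above it, which is what lets the decoder tell the two cases apart.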

C. BIDIRECTIONAL DIFFERENTIAL COMPRESSION AND EXTRACTION
Similarly, for a marked image I′ of size w × h, where w and h are the width and height of I′ respectively, a sliding window $sw_{i,j}$ of size 2 × 2 slides over I′ in the reverse order of Fig. 3. When it reaches the marked pixel $p'_{i,j}$, the pixel at location (i, j) of the marked image I′, the window covers the four pixels ($p'_{i,j}$, $p'_{i,j+1}$, $p'_{i+1,j}$, $p'_{i+1,j+1}$), and the first-order differences are obtained by
$$e1'_{i,j} = \mathrm{abs}(p'_{i,j} - p'_{i+1,j}), \qquad e2'_{i,j} = \mathrm{abs}(p'_{i,j+1} - p'_{i+1,j+1}) \tag{16}$$
Then we can get the second-order difference with
$$d'_{i,j} = \mathrm{abs}(e1'_{i,j} - e2'_{i,j}) \tag{17}$$
According to Fig. 5, an expanded difference $d'_{i,j}$ must be even before a secret bit is embedded, because the difference is doubled in both directions. Based on Eq. (14) and (15), if a secret bit 1 is embedded, it must be odd. So, if the marked second-order difference $d'_{i,j}$ ∈ [0, 2T + 1], we can extract the embedded bit and compress the marked second-order difference with Eq. (18), in which the function mod(x, y) returns the remainder after division of x by y. Then, the cover pixels can be restored with Eq. (19) and (20).
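The parity argument above can be checked with a small round-trip sketch on difference values. It assumes the same expand-or-shift rule as a conventional HS scheme (expand to 2d + b for d ≤ T, shift by T + 1 otherwise); the exact forms of Eq. (14) and Eq. (18) are not reproduced here, and the pixel restoration of Eq. (19) and (20) is not modeled. All function names are hypothetical.

```python
def embed_difference(d, bit, T):
    """Assumed embedding rule: expand bins 0..T, shift the rest by T + 1."""
    return 2 * d + bit if d <= T else d + T + 1


def extract_difference(marked, T):
    """Invert embed_difference.

    Returns (original_difference, bit), where bit is None when the
    marked value lies above 2T + 1 and therefore carries no payload.
    """
    if marked <= 2 * T + 1:
        return marked // 2, marked % 2   # parity is the embedded bit
    return marked - T - 1, None          # undo the shift


def round_trip_ok(T, max_d=64):
    """Check that every difference and bit survives embed then extract."""
    for d in range(max_d + 1):
        for bit in (0, 1):
            d_back, b_back = extract_difference(embed_difference(d, bit, T), T)
            if d_back != d:
                return False
            if d <= T and b_back != bit:
                return False
            if d > T and b_back is not None:
                return False
    return True


all_ok = all(round_trip_ok(T) for T in range(0, 6))
```

The check passes because expanded values never exceed 2T + 1 while shifted values start at 2T + 2, so the two ranges cannot collide.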

D. EMBEDDING CAPACITY AND DISTORTION
For the pixel $p_{i,j}$, after 1 bit of information is embedded, $d_{i,j+1}$ may change, and it is equally likely (50 percent each) to increase or decrease. Based on the embedding process in Eq. (14) and Eq. (15), when $d_{i,j}$ = k ∈ [0, T], we can embed a bit into two pixels, so the embedding capacity (EC) can be approximated by Eq. (21), in which hs(k) is the number of occurrences of the second-order difference k as defined in Eq. (9). In terms of probability, half of the embedded bits are '0's and the other half are '1's, so the embedding distortion (ED) can be approximated by Eq. (22).
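As a rough illustration of the capacity estimate, the sketch below counts the blocks whose second-order difference lies in [0, T]. It assumes that Eq. (21) reduces to this sum of histogram bins, which matches the verbal description but is not quoted from the paper, and it ignores the interaction between overlapping windows that makes the estimate approximate. The histogram values and the name `approx_capacity` are made up for the example.

```python
def approx_capacity(hist, T):
    """Estimate the number of embeddable bits for threshold T.

    hist maps a second-order difference value k to its count hs(k).
    Assumption: one bit per block whose difference lies in [0, T].
    """
    return sum(count for k, count in hist.items() if 0 <= k <= T)


# Hypothetical SODH of a tiny image: most blocks have small differences.
hist = {0: 5, 1: 2, 2: 1, 6: 1}
capacities = [approx_capacity(hist, T) for T in range(0, 7)]
```

The estimate is non-decreasing in T and saturates once T passes the largest occupied bin, which mirrors the slowing growth of EC with T reported in Section IV.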

E. EMBEDDING ALGORITHM
Given a gray-scale image I and a non-negative integer parameter T, the embedding algorithm is as follows.
Step 1 Scan image I in raster scanning order.
Step 2 For the current pixel $p_{i,j}$ and its corresponding pixel block $p_{i,j}$, $p_{i+1,j}$, $p_{i,j+1}$, $p_{i+1,j+1}$, obtain its first-order differences ($e1_{i,j}$, $e2_{i,j}$) by Eq. (7).
Step 3 For the current pixel $p_{i,j}$, obtain its second-order difference $d_{i,j}$ by Eq. (8).
Step 4 Repeat steps 2-3 until all the pixels have been processed.
Step 5 Obtain the SODH of cover image I by Eq. (9).
Step 6 According to this SODH, determine the threshold value T with Eq. (21). (Note: as Eq. (21) is an approximation, the value selected for T in step 6 should be 1 or 2 larger than the one given by Eq. (21).)
Step 7 Scan image I again in raster scanning order.
Step 8 For the current pixel $p_{i,j}$ and its second-order difference $d_{i,j}$, calculate the corresponding marked pixels with Eq. (14) and (15).
Step 9 Repeat step 8 for all the pixels visited in step 7 to obtain the marked image I′.
To illustrate the embedding process of the proposed scheme, we take an image block of size 10 × 10 from image Lena and embed bits of information into it, as shown in Fig. 6. The image block, taken from the hat of Lena, is shown in Fig. 6 (a). The first three specific examples, with threshold T = 5, are given as follows.
Continuing in this way, we obtain the embedding results in Fig. 6. The first-order differences $e1_{i,j}$ and $e2_{i,j}$ obtained by Eq. (7) are shown as the blocks in Fig. 6 (b), and the second-order differences $d_{i,j}$ obtained by Eq. (8) are shown as the left block in Fig. 6 (c). If we use the randomly selected bits shown as the right block in Fig. 6 (c), the embedded results from Eq. (14) and (15) are as shown in Fig. 6 (d).

F. EXTRACTING ALGORITHM
After receiving the marked gray-scale image I′, we can extract the bits of information and restore the cover image I as follows.
Step 1 Scan the marked image I′ in the reverse order of Fig. 3.
Step 2 For the current marked pixel $p'_{i,j}$ and its corresponding pixel block $p'_{i,j}$, $p'_{i+1,j}$, $p'_{i,j+1}$, $p'_{i+1,j+1}$, obtain its first-order differences ($e1'_{i,j}$, $e2'_{i,j}$) by Eq. (16).
Step 3 For the current marked pixel $p'_{i,j}$, obtain its second-order difference $d'_{i,j}$ by Eq. (17).
Step 4 Extract the embedded bit b and compress the second-order difference $d'_{i,j}$ with Eq. (18).
Step 5 Restore the cover pixels with Eq. (19) and (20).
Step 6 Repeat steps 2-5 until all the pixels have been processed.
Step 7 The cover image I and the embedded bits of information are obtained.
To illustrate the extraction process of the proposed scheme, we take the above pixel block of image Lena, extract the bits of information and recover the cover pixel block, as shown in Fig. 6. The image block, taken from the hat of Lena, is shown in Fig. 6 (a). The last three specific examples, with threshold T = 5, are given as follows.

IV. DISCUSSIONS
In this paper, we use the peak signal-to-noise ratio (PSNR) to assess the performance of the RDH schemes. The mean square error (MSE) can be calculated as
$$MSE = \frac{1}{w \times h} \sum_{i=1}^{h} \sum_{j=1}^{w} \left(p_{i,j} - p'_{i,j}\right)^2$$
So PSNR can be calculated as
$$PSNR = 10 \log_{10} \frac{255^2}{MSE}$$

A. THRESHOLD T
As shown in Fig. 4 (b) and in Eq. (14) and (15), the EC and ED of the proposed algorithm depend on the parameter T. We select eight standard 512 × 512 gray-scale images, shown in Fig. 7, to test the relation among T, EC and ED; Table 1 lists the experimental results. Consider image Lena: when the threshold T is 1, 5, 10, 15 and 20, respectively, the EC is 108236, 212024, 252279 and 256408 bit, respectively, and the PSNR is 49.7, 43.7, 41.2, 40.4 and 39.7 dB, respectively. So EC increases with T, but more slowly than T does, while the PSNR decreases with T (that is, the distortion grows), also more slowly than T does. The same behavior is observed for the other images shown in Fig. 7.
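The PSNR figures quoted in this section follow the standard definition for 8-bit images, which can be sketched as below. The code assumes a peak value of 255 and images stored as lists of rows; the helper names `mse` and `psnr` and the toy pixel values are illustrative only.

```python
import math


def mse(cover, marked):
    """Mean square error between two equally sized gray-scale images."""
    h, w = len(cover), len(cover[0])
    total = sum(
        (cover[i][j] - marked[i][j]) ** 2
        for i in range(h)
        for j in range(w)
    )
    return total / (w * h)


def psnr(cover, marked, peak=255):
    """PSNR in dB; returns infinity when the images are identical."""
    error = mse(cover, marked)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


# Toy example: two of four pixels change by one gray level.
cover = [[100, 102], [103, 104]]
marked = [[101, 102], [103, 103]]
value = psnr(cover, marked)
```

Because PSNR depends on the squared pixel change, splitting an expansion across two pixels instead of loading it onto one is a natural way to lower the MSE for the same embedded bit, which is the motivation the paper gives for the bidirectional expansion.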
To further observe this relationship, four of the eight standard images were selected for further experiments with more values of T. The experimental results are shown in Fig. 8. When T ≤ 3, EC increases rapidly and the PSNR decreases rapidly as T increases. When T > 3, EC increases slowly and the PSNR decreases slowly as T increases. For most applications, T ≤ 3 is sufficient.

B. PAYLOAD-LIMITED EMBEDDING
In an application, the payload EC is usually fixed and the corresponding ED is to be minimized. Based on Eq. (21) and (22), the proposed method can be cast as an optimization problem: for a given payload EC, minimize the ED, which amounts to determining the minimal threshold T. Taking image Lena as an example (Table 1 and Fig. 8), the values of T are 1, 5 and 15 when the ECs are 95000, 110000 and 250000 bits respectively.
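The payload-limited selection of T can be sketched as a linear search over the histogram. The sketch assumes the capacity for a threshold T is approximated by the number of blocks with second-order difference in [0, T]; the optional `margin` argument reflects the note in the embedding algorithm that T should be chosen 1 or 2 larger than the approximate value. The function name and histogram are hypothetical.

```python
def minimal_threshold(hist, payload, margin=0):
    """Return the smallest T whose estimated capacity covers the payload.

    hist maps a second-order difference k to its count hs(k).
    Returns None when even the full histogram cannot hold the payload.
    margin is added to the result to compensate for the approximation.
    """
    if not hist:
        return None
    total = 0
    for T in range(0, max(hist) + 1):
        total += hist.get(T, 0)   # cumulative count of bins 0..T
        if total >= payload:
            return T + margin
    return None


# Hypothetical SODH and three payload sizes.
hist = {0: 5, 1: 2, 2: 1, 6: 1}
small = minimal_threshold(hist, 6)
large = minimal_threshold(hist, 9)
too_big = minimal_threshold(hist, 10)
padded = minimal_threshold(hist, 6, margin=1)
```

Choosing the smallest feasible T keeps the number of shifted bins, and the size of each shift, as low as the payload allows.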

V. EXPERIMENTS AND ANALYSIS
To evaluate the performance of the proposed algorithm, we test it on the eight standard gray-scale images of size 512 × 512 shown in Fig. 7 and on 100 gray-scale images of size 816 × 616 randomly selected from the Cambridge database. Before the experiments, all color images are converted into gray-scale images. The software used in our experiments is MATLAB, and the experimental hardware is a PC with an i7 3.4 GHz CPU and 4.0 GB RAM. The secret bits in the experiments are generated with a pseudo random number generator. To demonstrate the superiority of the proposed algorithm, five state-of-the-art RDH methods [17], [28]- [33] are selected for the comparison.

A. COMPARISON OF PSNR WITH OTHER RELATED SCHEMES AT THE SAME EC
As PSNR can objectively reflect the quality of embedded images, we first compare the proposed algorithm with the state-of-the-art algorithms in terms of PSNR at the same EC for different images. The distortion performance of these algorithms is summarized as  [16] when EC = 60000, that is to say, the image Beach has higher stego-image quality than the image Baboon, because Beach is smoother than Baboon; the other algorithms show the same behavior. For the same image, however, the proposed algorithm has higher PSNRs than the other algorithms at the same EC. For example, for image Peppers with EC = 60000 bits, the PSNRs of [16], [28] and the proposed algorithm are 31.1, 45.7 and 49.7 dB respectively, while [17], [30] and [33] cannot embed that many bits into Peppers. To prove the superior performance of the proposed algorithm, we calculate the average embedding
Then, we continue to compare the performance of the proposed method with the five RDH methods, as shown in Fig. 9. The results show that the curve of the proposed method is not very smooth and jumps at some points. For example, at EC = 9 × 10^4 for image Lena and EC = 6 × 10^4 for image Peppers, the slope changes greatly from one side of the point to the other. This is because, for a given value of T and a given image, EC reaches its maximum at these points, and T must be increased to increase EC further. Some algorithms occasionally perform better than the proposed algorithm on some images: for example, for the images Peppers and Baboon shown in Fig. 9 (b) and (d), algorithm [28] is superior to the proposed algorithm when EC < 2 × 10^4. However, the proposed algorithm is superior to [28] when EC > 2 × 10^4, which meets most application requirements. For most images, the curve of the proposed algorithm lies above the curves of the other algorithms, which shows that the embedding performance of our method is better than that of the other methods.
To demonstrate this superior performance, we further experimented on 100 images randomly selected from the Cambridge image database and averaged their performance, as shown in Fig. 10; the same results were obtained. Fig. 9 and Fig. 10 show that the ECs of algorithms [30] and [33] are less than 2 × 10^4 bits, and the EC of algorithm [17] is less than 6 × 10^4 bits. At the same time, the curves of methods [30] and [33] lie below the curve of our method. Method [17] expands a specific two-dimensional PEH for embedding, and the PEH needs to satisfy two conditions to embed a bit, so its EC is low; after embedding, each dimension has been modified, so its embedding distortion is large. Method [30] uses the fuzzy C-means (FCM) clustering method to build multiple histograms, and only some of the clusters can be used for embedding, so its EC is low. In [33], blocks are classified into highly correlated and lowly correlated smooth blocks, and the lowly correlated blocks have a low utilization rate, so its EC is also low, as in [30]. Both [30] and [33] use a multilayer embedding method in which some pixels may be modified more than once, which causes higher distortion. In [28], a genetic algorithm is employed to search for nearly optimal pairs of zero and peak bins, so it has higher embedding performance when EC is low. When a large payload needs to be embedded, however, the multiple-shifting scheme is employed, which causes higher distortion, so its performance is worse than that of the proposed algorithm for most applications and images. In [16], each pixel can be embedded with one bit, but each vector is modified by twice the difference, so it has both large EC and large distortion. The proposed algorithm embeds bits with the second-order difference, which has a steeper histogram for each image, so it has large EC; and in each block only half of the pixels are decreased or increased, by half of the difference, so its distortion is small.
Moreover, as the window slides, some pixels are increased and then decreased, and others are decreased and then increased, so the overall distortion decreases further. Our proposed method is therefore superior to the other methods.
When the payload is small, the steepness of the histogram has little influence on the embedding performance, so the advantage of the SODH is not obvious. Therefore, when the payload is small, some of the state-of-the-art algorithms may have better embedding performance than the proposed algorithm for some images, as shown in Fig. 9 and Fig. 10. As the performance of method [16] is far lower than that of the other methods, we further test the embedding performance of the proposed algorithm and the other four state-of-the-art algorithms with low embedding payloads, as shown in Fig. 11 and Fig. 12. Fig. 11 shows that, for some images, the embedding performance of the proposed method is worse than that of some other algorithms, while for others it is better. For example, algorithms [17], [28] and [33] are superior to the proposed algorithm for image Peppers, and algorithm [17] is superior to it for image Bird, but the proposed algorithm is superior to the other algorithms for images Chimney and Beach. Fig. 12 shows that the average embedding performance of the proposed method is better than that of the other methods. So the proposed method is superior to the other methods at low payload too.
Therefore, the proposed algorithm is superior to the other algorithms, and this advantage is more obvious when the embedding payload is large. The proposed algorithm is thus suitable for most data hiding applications.

C. PRACTICAL EVALUATION OF COMPUTATIONAL COMPLEXITY
Based on Fig. 9 and Fig. 10, for a gray-scale image of size 512 × 512, the ECs of algorithms [17], [30] and [33] are less than 6000 bits and 3000 bits respectively, and their embedded image quality is lower than that of the proposed algorithm. Although the EC of algorithm [16] is large, its embedded image quality is far lower than that of the proposed algorithm, so its embedding performance is much lower. When EC is less than 20000 bits, the embedded image quality of [28] may be higher than that of the proposed algorithm for some gray-scale images of size 512 × 512. To further evaluate the proposed algorithm, we compare its computational cost with that of [28]. In the experiments, we test the computation time (CT) of the proposed algorithm and of [28] with different payloads on various image sizes. To ensure the effectiveness of the test, 100 gray-scale images were randomly selected from the Cambridge image database and converted to images of size 512 × 512, 1024 × 1024 and 2048 × 2048 respectively. The embedding payloads used in these experiments were 0.2, 0.3 and 0.6 bit per pixel (bpp) respectively. The average CTs over the 100 images are listed in Table 3. They show that: (1) the CT of both algorithms increases with the embedding payload, because the EC increases with the payload. For example, for images of size 1024 × 1024 with embedding payloads of 0.2, 0.3 and 0.6 bpp, the CTs of the proposed algorithm are 1.001, 1.397 and 2.595 seconds respectively, and the CTs of [28] are 3.188, 3.608 and 4.676 seconds respectively. (2) The CTs of the proposed algorithm are much smaller than those of [28]. For example, for images of size 1024 × 1024, the quoted times correspond to reductions of about 68.6%, 61.3% and 44.5% compared with [28]. Thus the computational performance of the proposed method is far better than that of [28].
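The relative reductions in computation time can be recomputed directly from the times quoted above for 1024 × 1024 images. The snippet below is plain arithmetic on those six numbers; the variable names are illustrative.

```python
# Average computation times in seconds for 1024 x 1024 images,
# at payloads of 0.2, 0.3 and 0.6 bpp, as quoted in the text.
proposed = [1.001, 1.397, 2.595]
reference_28 = [3.188, 3.608, 4.676]

# Relative reduction of the proposed method with respect to [28], in percent.
reductions = [
    round(100 * (ref - ours) / ref, 1)
    for ours, ref in zip(proposed, reference_28)
]
```

With the times quoted in the text, the 0.6 bpp entry evaluates to about 44.5%, so the gap between the two methods narrows as the payload grows but remains substantial.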

VI. CONCLUSIONS
Based on an analysis of the HS and DE techniques, a novel second-order HS algorithm is proposed in this paper. From the two first-order differences of two pairs of pixels in a pixel block, we obtain the second-order difference of the four pixels. As the second-order difference histogram of an image is steeper than the first-order difference histogram, the proposed algorithm has good embedding performance. When the EC is small, the steepness of the histogram has little influence on the embedding performance, so the advantage of the second-order difference histogram is not obvious. Experimental results show that the proposed algorithm is superior to the state-of-the-art algorithms and is suitable for most data hiding applications.