Robust Hashing Based on Quaternion Gyrator Transform for Image Authentication

Image hashing is one of the most effective methods in many image processing applications, including image recognition, authentication, tampering detection, and image retrieval. In this paper, we propose a novel image hashing method based on the quaternion gyrator transform (QGT), which is more secure and compact in addition to being robust and discriminative. In the proposed method, the QGT, which like the traditional quaternion Fourier transform is a linear integral transform, is applied to effectively extract image features. Our hash function first scales all images to a fixed size. The QGT is then applied to each non-overlapping block of the original image to extract a feature map. For each block, the inner product of the feature vector and a random weight vector yields one bit of the final image hash. Experiments on different image databases are conducted to verify the efficiency of our method. The results show that our hash is robust to most content-preserving operations while retaining good discrimination. Compared with some state-of-the-art hashing algorithms, our algorithm achieves better performance in both robustness and discriminability.


I. INTRODUCTION
The continuous development and widespread popularity of mobile devices have brought convenience to people's lives. However, many new problems have arisen at the same time. For example, the huge amount of data puts great pressure on server storage, especially when multiple users repeatedly store the same pictures on a server. Meanwhile, the large number of ultrahigh-definition pictures on the network may be damaged during transmission, and an attacker can maliciously tamper with an image without changing its original semantic information. Therefore, it is necessary to research and explore new and effective techniques for image compression and image authentication.
In recent years, many solutions have been proposed to solve the problem of image authentication, and they have achieved good results under sufficient research. (The associate editor coordinating the review of this manuscript and approving it for publication was Abdel-Hamid Soliman.) For image authentication, digital watermarking is one effective solution. A digital watermark [1], [2] can be effectively embedded in an image with little change to the original picture; we then authenticate the image by detecting the watermark to determine whether the image has been changed. However, some applications do not allow any small changes in image content and quality.
A more effective technique is image hashing, whose nature can effectively reduce the pressure of storage and retrieval. For image retrieval [3]-[9], experts have designed many adversarial hash learning models that achieve good results. Hash algorithms also have applications in video. Song et al. [10] propose a novel unsupervised video hashing framework dubbed Self-Supervised Video Hashing (SSVH), which adequately explores the temporal order of video frames in an end-to-end learning-to-hash fashion. Zhou et al. [11] propose a new entropy equilibrium optimization (EEO) methodology to enhance the coding performance of VR360 videos and, based on this method, design two algorithms, including EEOA-ERP. The scheme of [12] can resist small-angle image transformation problems. The discrete Fourier transform (DFT) based on the polar system and a random key [13] ensures the security of the hash. The quaternion discrete Fourier transform (QDFT) based on the log-polar system [14] effectively maintains the robustness of hashing. Tang et al. [15] propose a method based on visual attention and invariant moments, using the model presented by Itti et al. [16] to weight the LL sub-band of a one-level 2D discrete wavelet transform (DWT) and using invariant moments to improve the robustness and discriminability of the hash. Using center-symmetric local binary patterns (CSLBP) to extract features together with singular value decomposition (SVD) [17] is effective against large-angle rotation as well as noise, contrast, brightness, and compression changes. Tang et al. [18] utilize the DWT and the phase spectrum of the Fourier transform (PFT) to jointly extract image features and use ring partition to construct the final hash, which improves discrimination compared with other known methods. Quaternion singular value decomposition (QSVD) [19] is first applied to image hashing.
After converting the image to the CIELab color space and using QSVD to decompose each block to construct a hash, it also improves discrimination ability and stability. Hamid et al. [20] are the first to use the Laplacian pyramid to generate image hashes: two different pyramids are generated with different filters, and the difference between the two Laplacian pyramids is calculated to obtain a unique and robust hash, which tolerates non-malicious operations and detects minor tampering. An algorithm based on local color features [21] considers all components of the color image and achieves good discrimination. Image hashing based on the color vector angle [22] divides the image into blocks to extract vector information and uses the discrete wavelet transform to compress the feature matrix into the image hash, which can resist normal digital operations. Laradji et al. [23] propose using the quaternion Fourier transform (QFT) to calculate the hash; the QFT hash improves discrimination ability, but its robustness is not good enough. Ouyang et al. [24] show that quaternion Zernike moments (QZMs) offer a sound way to jointly handle the three channels of color images without discarding chrominance information; their hash function based on QZMs provides a short hash and is robust to most common content-preserving operations. Yan et al. [25] design a quaternion Fourier-Mellin transform to deal with localized tampering. Karsh et al. [26] extract global and local features through ring division and Markov probability to construct a hash for authentication. Qin et al. [27] exploit block truncation coding to design an image hashing algorithm with good perceptual robustness. Li and Guo [28] use sparse coding to construct an image hash for content recognition.
Since most recent image hashing algorithms convert the color image into a gray-scale one, losing the structural information of the three color channels and the chrominance information, they do not reach satisfactory performance on color images. Focusing on these problems, we utilize the QGT to design an image hashing algorithm. Firstly, quaternions can capture the important structural information of a color image and provide an optimal representation of color. Many key problems have been well solved by taking advantage of this property, such as image watermarking [29], image recognition [30], and image quality assessment [31]. Secondly, the QGT can be realized by phase modulation and the quaternion discrete Fourier transform (QDFT). Accordingly, the low-frequency coefficients of the QGT embody most of the image information and important image features.
The contributions of our study are as follows: (1) We propose a robust image hashing algorithm based on the QGT that converts images to binary codes for image authentication. This is the first method to capture the representation of image content by the QGT. (2) Our method generates the binary codes in an unsupervised fashion, thereby avoiding the need to label large amounts of data, as is required in supervised learning. (3) The security of the proposed hash algorithm is guaranteed by performing the QGT with a secret rotation angle α on the color image. Extensive experiments verify the efficiency of our function; the results show that our algorithm builds a novel and compact image hash and greatly improves robustness to content-preserving operations and discrimination compared with existing hash algorithms.

II. RELATED WORK
The quaternion [32], an extension of the complex numbers, is a hypercomplex number that was first described by Hamilton in 1843 and is defined in a four-dimensional space. The structure of a quaternion can well represent the color structure information of a color image; based on this characteristic, quaternions are widely used in image processing. From the perspective of mathematics, a quaternion q is represented as

q = q_0 + q_1 i + q_2 j + q_3 k,

where q_0, q_1, q_2, q_3 are real numbers and i, j, k are imaginary units satisfying

i^2 = j^2 = k^2 = ijk = -1.

Quaternions also support the usual operations, for example scalar multiplication:

λq = λq_0 + λq_1 i + λq_2 j + λq_3 k.

For color images, we can use a pure quaternion to represent a pixel:

f(x, y) = R(x, y) i + G(x, y) j + B(x, y) k,   (1)

where R, G, and B respectively denote the red, green, and blue channels of the RGB image at the point (x, y).

The gyrator transform [33] originates from the first-order optical system for the free combination of generalized cylindrical lenses studied by the Spanish scholar Rodrigo et al. in 2006. Because the parameter of the transform is a rotation angle, it is named the gyrator transform. The two-dimensional gyrator transform has been generalized to the quaternion domain [34]:

G(u, v) = QGT_α[f](u, v) = ∬ f(x, y) K_α(x, y; u, v) dx dy,

where K_α is the kernel function of the transform:

K_α(x, y; u, v) = (1 / |sin α|) exp( μ2π [ (xy + uv) cos α - (xv + yu) ] / sin α ),

and μ is a unit pure quaternion with μ^2 = -1. At the same time, the quaternion gyrator transform and the quaternion Fourier transform are related, as shown in [34]: the quaternion gyrator transform of f(x, y) with rotation angle α is equivalent to a three-step operation, namely a phase modulation, a quaternion Fourier transform, and a second phase modulation:

QGT_α[f](u, v) = ( g(u, v) / |sin α| ) QFT[ h(x, y) f(x, y) ]( v / sin α, u / sin α ),

where h(x, y) and g(u, v) are the chirp phase-modulation functions

h(x, y) = exp( μ2π xy cot α ),   g(u, v) = exp( μ2π uv cot α ).
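The chirp-modulation decomposition described above can be sketched for a single channel as follows. This is a minimal illustration only: the discrete sampling of the chirps (the division by n in the exponent) and the use of the complex unit 1j in place of the quaternion unit μ are my assumptions, not the paper's exact discretization.

```python
import numpy as np

def gyrator2d(f, alpha):
    """Discrete sketch of the gyrator transform of a 2-D array via the
    chirp -> 2-D FFT -> chirp decomposition (single channel)."""
    n = f.shape[0]
    x = np.arange(n) - n // 2
    X, Y = np.meshgrid(x, x, indexing="ij")
    cot = np.cos(alpha) / np.sin(alpha)
    h = np.exp(1j * 2 * np.pi * X * Y * cot / n)   # input phase modulation
    g = np.exp(1j * 2 * np.pi * X * Y * cot / n)   # output phase modulation
    F = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(f * h)))
    return g * F / np.abs(np.sin(alpha))
```

At α = π/2 the chirps reduce to 1 and the transform collapses to a centered 2-D Fourier transform, which is a convenient sanity check for the decomposition.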

III. THE APPROACH
As shown in Fig. 1, the image hashing algorithm proposed in this article consists of three steps. In the first step, the input image is scaled to a uniform size by bicubic interpolation, and the normalized image is divided into image blocks that do not overlap each other. In the second step, the local block feature map is extracted through the quaternion gyrator transform. In the third step, each feature map is compressed to a vector, its inner product with a random vector is taken, and the result is converted into the final hash by a binarization function. The difference between hashes is measured by the Euclidean distance. Our algorithm is described in detail below.

A. PREPROCESSING
Since real pictures have different sizes, we first unify the size of the input image through bicubic interpolation to ensure that the hash generated later has a fixed size. All input images are uniformly scaled to the size of M * M, and the input image is then divided into non-overlapping N * N sub-blocks. Thus, the total number of sub-blocks is k (k = (M/N)^2). For example, if the input image size is 64 * 64 and the block size is 8 * 8, the number of blocks is 64. Different block sizes affect the robustness and discriminability of the final hash, and we verify the optimal block strategy through experiments. Each pixel of the input image is converted from the color domain to the quaternion domain by Eq. (1).
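The resizing and blocking step can be sketched as follows, using the 64 * 64 image and 8 * 8 block example above. Nearest-neighbor resampling stands in for the paper's bicubic interpolation here, which is an illustrative assumption.

```python
import numpy as np

def preprocess(img, M=64, N=8):
    """Resize img (H x W x 3) to M x M and split it into non-overlapping
    N x N blocks; returns an array of (M // N) ** 2 blocks."""
    H, W = img.shape[:2]
    rows = np.arange(M) * H // M          # nearest source row per target row
    cols = np.arange(M) * W // M          # nearest source column per target column
    small = img[rows][:, cols]            # M x M x 3 resized image
    b = M // N                            # blocks per side, so k = b * b blocks
    blocks = small.reshape(b, N, b, N, 3).swapaxes(1, 2).reshape(b * b, N, N, 3)
    return blocks
```

For a 128 * 96 input this yields 64 blocks of size 8 * 8 * 3, matching the block count in the example.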

B. FEATURE EXTRACTION
The feature extraction part uses the QGT with angle α. The quaternion gyrator transform can be realized by performing the single-channel gyrator operation three times, once per color channel [34]. We adopt another way, realizing the quaternion gyrator transform through the quaternion Fourier transform; that is, the QGT can be realized by calculating the quaternion Fourier transform twice. Suppose G(x, y) is the feature matrix of f(x, y) transformed by the QGT. We obtain a feature matrix in the real domain by the following formula.
Ĝ(x, y) = mod(G(x, y)),

where mod(·) is the quaternion norm mentioned above, so Ĝ is real-valued. The above operations are performed for each block B_i of the input picture. In each block, the features of size r * r in the center area of the transformed matrix are cropped and expanded into a one-dimensional vector to obtain the feature vector C_i (i = 1, 2, ..., k). We then normalize the data of each vector:

Ĉ_i = (C_i - u_i) / (δ_i + λ),

where u_i and δ_i are respectively the mean and standard deviation of C_i, defined as

u_i = (1 / r^2) Σ_{j=1}^{r^2} C_i(j),   δ_i = sqrt( (1 / r^2) Σ_{j=1}^{r^2} (C_i(j) - u_i)^2 ),

λ is a very small value that prevents division by zero, and r is the size of the cropped center of each block.
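The per-block crop-and-normalize step can be sketched as below. The value r = 4 is a placeholder choice, and lam corresponds to the paper's λ; both exact values are assumptions for illustration.

```python
import numpy as np

def block_feature(mag_block, r=4, lam=1e-3):
    """Crop the central r x r magnitudes of a transformed N x N block,
    flatten to a vector, and z-score normalize with a small lam guard
    against a zero standard deviation."""
    n = mag_block.shape[0]
    s = (n - r) // 2
    c = mag_block[s:s + r, s:s + r].ravel()   # central r * r features
    return (c - c.mean()) / (c.std() + lam)   # zero-mean, unit-ish scale
```

The returned vector has length r^2 and mean approximately zero, which is what the later inner-product step relies on.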

C. HASH GENERATION
For each normalized feature vector Ĉ_i, we use the following formula to get the approximate hash value of the first step:

H_B(i) = ⟨ Ĉ_i, L ⟩,

where L is a random vector of length r^2 generated from a positive-integer key. Next, we apply the following binarization to each H_B(j):

Ĥ(j) = 1 if H_B(j) ≥ 0, and Ĥ(j) = 0 otherwise,

where j = 1, 2, ..., k.
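A sketch of this projection-and-binarization step follows. The key-seeded Gaussian projection and the zero threshold are assumptions where the text leaves the details implicit (the features are zero-mean normalized, which makes a zero threshold natural).

```python
import numpy as np

def hash_bits(features, key):
    """features: k x r^2 matrix of normalized block feature vectors.
    A random weight vector L, seeded by a positive-integer key, gives
    one inner-product score per block; each score is binarized to one
    hash bit."""
    rng = np.random.default_rng(key)
    L = rng.standard_normal(features.shape[1])   # random weight vector, length r^2
    scores = features @ L                        # one inner product per block
    return (scores >= 0).astype(np.uint8)        # one bit per block
```

The same key always reproduces the same L, so the hash is repeatable for the legitimate key holder, which is the basis of the security experiment later.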

D. SIMILARITY EVALUATION
The similarity between two pictures is calculated from the Euclidean distance between their corresponding hashes. For example, let the hash vectors of two pictures be H_1 and H_2; then the normalized distance is

D(H_1, H_2) = (1 / k) sqrt( Σ_{l=1}^{k} ( h_1(l) - h_2(l) )^2 ),

where h_1(l) and h_2(l) are the l-th elements of the hashes. We set a threshold T: if the distance is greater than T, the images are judged to be different; otherwise, they are judged to be similar.
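A small sketch of this distance measure is below. Dividing by the hash length k is an interpretation of the "normalized" Euclidean distance that matches the magnitudes (roughly 0.01 to 0.09) reported in the experiments.

```python
import numpy as np

def hash_distance(h1, h2):
    """Normalized Euclidean distance between two equal-length hash
    vectors: plain Euclidean distance divided by the hash length."""
    a = np.asarray(h1, dtype=float)
    b = np.asarray(h2, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)) / a.size)
```

For 64-bit binary hashes, a single differing bit gives a distance of 1/64 ≈ 0.0156, consistent with the distances observed for similar images.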

IV. EXPERIMENTS
In this experiment, we set M = 512, N = 64, k = 64, α = π/2, λ = 0.001 for the algorithm proposed above. As a result, we finally obtain an image hash whose length is 64 bits. We discuss the robustness and discrimination of this algorithm in Sections A and B, respectively. In Section C, we discuss whether rounding and binarization reduce accuracy while reducing storage pressure. Section D verifies the security of the hash. The receiver operating characteristic (ROC) curve [35] is used to evaluate the effect of different thresholds: the x-axis denotes the false positive rate (P_FPR) and the y-axis the true positive rate (P_TPR). P_TPR and P_FPR indicate the abilities of robustness and discrimination, respectively, and are defined as

P_TPR = n_1 / N_1,   P_FPR = n_2 / N_2,

where n_1 is the number of similar image pairs correctly judged as similar, N_1 is the total number of similar pairs, n_2 is the number of different image pairs wrongly judged as similar, and N_2 is the total number of different pairs. We utilize the area under the ROC curve (AUC) [35] for quantitative comparison. The range of the AUC is [0, 1], and a larger AUC implies better performance.
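One operating point of the ROC curve can be computed directly from the two sets of hash distances, as in the following sketch (a pair is judged similar when its distance does not exceed the threshold T).

```python
def roc_point(similar_distances, different_distances, T):
    """One ROC operating point for threshold T: TPR is the fraction of
    similar pairs judged similar, FPR the fraction of different pairs
    judged similar."""
    tpr = sum(d <= T for d in similar_distances) / len(similar_distances)
    fpr = sum(d <= T for d in different_distances) / len(different_distances)
    return tpr, fpr
```

Sweeping T over the observed distance range produces the full curve.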

A. ROBUSTNESS
We adopt the Kodak database [36] to verify the robustness of our hash algorithm. The database contains 24 images of size 512 * 768 or 768 * 512; Fig. 2 shows some of them. We employ content-preserving operations to simulate robustness attacks and generate additional image sets. The operations include JPEG compression, Gaussian noise, salt-and-pepper noise, Gaussian filtering, median filtering, image scaling, image rotation, contrast adjustment, brightness adjustment, etc. More details are given in Table 1, in which there are 58 operations in total. In this case, the image collection of this experiment contains a total of 24 * 59 + 24 = 1440 images. Fig. 3 shows the distribution of the Euclidean distances for the different operations with different parameters; the x-axis is the parameter value, and the y-axis is the average distance of the 24 images from the original image under that operation. As shown in Fig. 3(j), we vary the brightness adjustment parameter from 0.6 to 1.3. The mean normalized Euclidean distances of similar images under the different parameter values fluctuate between 0.01 and 0.015. The blue line also shows that the average Euclidean distance of all image hashes in this dataset under all parameters is smaller than 0.02. In short, the distance under all operations does not exceed 0.025. When we set the threshold to 0.022, our algorithm correctly identifies similar images with a probability of 99.9%. In a word, our algorithm performs well in robustness.

B. DISCRIMINATION
Here we use the UCID database [37] to verify the discrimination ability of our algorithm. The images in this data set are of size 384 * 512 or 512 * 384. The data set has 1338 color images in total, some of which are shown in Fig. 4. The Euclidean distance is calculated between each pair of different image hashes, giving 1338 * 1337 / 2 = 894453 results. The overall distribution is shown in Fig. 5(b). At the same time, we repeat the operations of the robustness experiment and display the resulting distribution in Fig. 5(a). In both figures, the x-axis is the distance and the y-axis is the frequency. As these two figures show, the minimum distance between different images is greater than 0.03, the maximum is 0.093, the average is 0.066, and the variance is 0.008. The hash distance between different images is thus much larger than that between similar images, which proves that our hash scheme is very effective for discrimination. Table 2 shows how the probability that similar images are judged to be similar and the probability that different images are judged to be similar vary as the threshold changes; it also shows that when the threshold is 0.022, the error rate is the lowest. From the distribution histograms, we can also see that the threshold 0.022 is the most appropriate.

C. EFFECT OF HASH LENGTH ON PERFORMANCE
In the algorithm, the hash undergoes a rounding and a binarization process. As shown in Fig. 6, the performance after rounding is better than or approximately equal to that without rounding, and the performance after binarization is better than that without binarization. For unrounded vectors, a floating-point number generally consumes four or eight bytes (a byte contains 8 bits), so even with four-byte floats each hash occupies at least 64 * 4 * 8 = 2048 bits. After rounding, the integers in all hash vectors range from -8 to 16, which requires at least 5 bits each, so a hash takes 5 * 64 = 320 bits. For the binary hash, only 64 bits are needed, greatly reducing the space consumed by hash storage. Meanwhile, there is no negative effect on robustness and discrimination.
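The storage saving of the binary hash is easy to see in practice: 64 bits pack losslessly into 8 bytes, as the following sketch shows.

```python
import numpy as np

# A 64-bit binary hash packed into bytes: 8 bytes of storage instead of
# 64 stored floating-point or integer values.
bits = np.array([1, 0] * 32, dtype=np.uint8)
packed = np.packbits(bits)        # 8 bytes
restored = np.unpackbits(packed)  # lossless round trip back to 64 bits
```

Packed hashes also make database-scale distance computations cheaper, since comparisons operate on a few bytes per image.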

D. SECURITY
The color image Lena is selected as a test image to evaluate the security of our image hashing. First, a group of secret keys is used to control the hash generation of Lena. Then, another 100 different groups of secret keys are used to extract image hashes of Lena. Finally, the Euclidean distances between the first hash and the other 100 hashes are calculated. The results are presented in Fig. 7, where the x-axis is the index of the wrong keys and the y-axis is the Euclidean distance. The maximum and minimum distances are 0.39105 and 0.01408, and the mean and standard deviation are 0.13468 and 0.09153, respectively. The mean distance is much bigger than the mean distances observed for similar images. This illustrates the good security of our algorithm.

V. PERFORMANCE COMPARISONS
To show the advantages of our algorithm, we select some excellent algorithms for comparison: the random-walking hash algorithm [38], the QSVD hash algorithm [19], the SVD-CSLBP hash algorithm [17], the Laplacian-pyramid hash algorithm [20], and a hash algorithm based on a deep convolutional neural network (DCNN) [39]. These are hash algorithms recently published in journals or conferences. To ensure a fair experiment, we uniformly set the image size to 512 * 512. As shown in Fig. 8, we compare the different algorithms in the form of ROC curves. The ROC curve of our algorithm is closer to the upper left corner than the other curves, so it can be intuitively concluded that our algorithm has better classification performance than the compared algorithms. For quantitative analysis, the AUC of every hash algorithm is also calculated; the AUC values are 0.949, 0.988, 0.996, 0.99928, 0.864, 0.904, 0.979, and 0.99968 (ours). At the same time, we use the same machine to measure the average hash generation time of the different algorithms. As shown in Table 3, the shortest hash length in our optimal case is only 64 bits. Overall, our hash algorithm outperforms the other hash algorithms.

VI. CONCLUSION
In this paper, we propose a robust hashing algorithm based on the quaternion gyrator transform for image authentication. Extensive experiments show that our algorithm is robust to JPEG compression, contrast adjustment, filtering, and other content-preserving operations. The comparison of ROC curves shows that our proposed hash algorithm is superior to existing methods in robustness and discrimination. Meanwhile, our hash code is more compact and shorter. Future work will focus on two subjects. First, we will apply the idea of this hash algorithm to content-based image retrieval (CBIR). Second, given the excellent performance of our method in image processing, we will introduce a content-based model for video hashing.

JUNLIN OUYANG received the M.S. degree in computer application from Central South University, China, in 2007, and the Ph.D. degree in computer science from Southeast University, China, in 2015. He is currently a Lecturer with the School of Computer Science and Engineering, Hunan University of Science and Technology, China. His research interests include color image processing, information security, image retrieval, deep learning, image forensics, and so on.
XIAO ZHANG is currently pursuing the master's degree with the School of Computer Science and Engineering, Hunan University of Science and Technology. His research interests include digital image technique, information security, deep learning, and so on.
XINGZI WEN received the M.S. degree in business administration, in 2007, and the Ph.D. degree in management science and engineering from Central South University, China, in 2014. She is currently a Lecturer with the Hunan University of Science and Technology, China. She is mainly engaged in research work in the software industry cluster, decision science, and complex networks.