Robust Hashing With Local Tangent Space Alignment for Image Copy Detection

Robust hashing is a useful technique for image applications such as watermarking, authentication, quality assessment and copy detection. This article proposes a new robust hashing method for image copy detection based on local tangent space alignment (LTSA). A key contribution is the weighted visual map computed from the difference of Gaussian (DOG) and a visual attention model; this map provides the proposed method with good robustness. Another contribution is feature learning via LTSA from the feature matrix of the weighted visual map in the discrete cosine transform domain. Because LTSA can maintain the local geometric relationships within the image, the learned features make the proposed method discriminative. Extensive experiments on public databases are conducted to validate the proposed method. Compared with several well-known robust hashing methods, the proposed method demonstrates preferable classification performance in terms of discrimination and robustness. Copy detection performance is also tested, and the results verify the effectiveness of the proposed method.

Robust hashing derives a compact hash code from an input image. It is an effective technique of image representation and has been used in many image applications [11], [12], [13], such as watermarking, authentication, quality assessment and copy detection. This article exploits local tangent space alignment (LTSA) to design a new robust hashing method for image copy detection.
Generally, a robust hashing method for digital images must satisfy two key properties [14], [15], [16], [17]: robustness and discrimination. Robustness signifies that visually similar images should be encoded into the same or similar hash sequences. Discrimination means that different images should be converted into different hash values. Note that there is a mutual restriction between these two properties. In recent years, diverse robust hashing methods have been reported to improve both. Frequently used techniques include the Radon transform [18], singular value decomposition (SVD) [19], 2D-2D (two-directional two-dimensional) PCA (principal component analysis) [20], the discrete cosine transform (DCT) [21], [22], Zernike moments [23], the discrete wavelet transform (DWT) [24], ring partition (RP) [25] and the discrete Fourier transform (DFT) [26]. The following part reviews some recent robust hashing methods relevant to our work.
To improve robustness, Davarzani, Mozaffari and Yaghmaie [27] exploited CSLBP (central symmetric local binary pattern) and SVD to build a new hashing method. Their method reaches preferable robustness, but its discrimination ought to be strengthened. Huang et al. [28] computed a block-based hash by the random walk (RW) technique to enhance the security of hash values. This hashing can ensure robustness and security effectively. Sajjad et al. [29] selected robust features by combining the Canny edge operator and DCT coefficients to calculate a hash code. This method works well on industrial surveillance images, but it is robust to only a small number of digital operations. Shen and Zhao [30] first generated color features from a secondary image, then used quadtree decomposition to form structural features, concatenated these two feature sets, and finally produced a secure hash sequence by encryption. Their method [30] can resist most image operations, but it does not consider the effect of rotation. In another work, Yuan and Zhao [31] utilized local and global features to compute hash sequences. The global features are generated by combining statistical features of different three-dimensional views with SVD, and the local features are obtained from the energy features of blocks. This method is fast and can detect local tampering. However, it does not solve rotation robustness either. Liu et al. [32] utilized mean pooling and local binary patterns to build a hashing method for color images; this method is also limited in resisting rotation. To reach robustness against rotation, Tang et al. [33] employed RP to conduct image division, selected the mean, variance, kurtosis and skewness of each image ring to form a feature vector, and generated a hash code through the invariant distance of feature vectors. In another work, Tang et al. [34] employed DFT and MDS (multidimensional scaling) to learn a compact hash. This method has excellent rotation robustness, but its discrimination ought to be improved.
Recently, Abdullahi, Wang and Li [35] utilized fractal coding and the FMT (Fourier-Mellin transform) to calculate a hash. Their method has favorable robustness, but there is still room for improving discrimination. To enhance discrimination, Zhao and Yuan [36] employed the mean curve, valley curve and peak curve to design a new hashing method; the use of more image features improves the discrimination of their method [36]. To obtain discriminative features of color images, Tang et al. [37] calculated color vector angles (CVA) of a color image, selected the histogram of CVA (HCVA) as the image feature and compressed it with DCT. The HCVA-DCT method makes full use of color information and thus reaches good discriminative capability. In another work, Tang et al. [38] utilized low-rank representation (LRR) and RP to calculate a hash code. This method can handle large-angle rotation, but its discrimination needs to be enhanced. Motivated by the advantages of quaternion theory in describing color images, Ouyang, Liu and Shu [39] selected Zernike moments in the quaternion domain and the scale-invariant feature transform (SIFT) feature to design a novel hashing method. Since all components of color pixels are utilized by the quaternion Zernike moments, the discrimination of their method [39] is significantly enhanced. In addition, this Zernike moments-based method [39] can detect image tampering. More recently, Biswas et al. [40] generated a hash with global texture energy and dominant neighborhood structure; their method can be applied to Tor domain recognition. Wang et al. [41] combined global, local and structural features to compute a hash code; their method shows good performance in tampering detection and localization. Singh et al. [3] employed SVD and the KAZE feature to implement a hashing system with desirable robustness against gamma correction and geometric attacks. Huang and Liu [42] designed a new image hashing method based on the gray level co-occurrence moment (GLCM) and dominant DCT coefficients [21]; it obtains a relatively preferable balance between discrimination and robustness. Table I presents the core techniques of some typical robust hashing methods, where the first column is the reference number of each method, the second column presents its core techniques, and the third column is the year of publication.
The above review indicates that most robust hashing methods cannot achieve a preferable balance between discrimination and robustness. Aiming at this, we investigate the use of LTSA and exploit it to design a novel robust hashing method. Compared with some well-known hashing methods, the proposed method shows preferable classification performance in terms of discrimination and robustness. Copy detection performance is also tested, and the results verify the effectiveness of our robust hashing. The rest of this article is organized as follows. Details of the proposed robust hashing method are described in Section II. Experimental results are discussed in Section III. Performance comparisons with some existing robust hashing methods are presented in Section IV. Section V introduces our application to copy detection. Section VI concludes this article.

II. PROPOSED ROBUST HASHING METHOD
Our proposed robust hashing method includes four phases. The framework is shown in Fig. 1. First, the weighted visual map is constructed from Itti's saliency map and the DOG edge map. Second, block-based DCT features are extracted from the weighted visual map. Third, the LTSA technique is used to learn discriminative features from the block-based DCT features. Finally, the compact features are quantized and encrypted into a binary sequence. Each of these four phases is explained in the following sections.

A. Weighted Visual Map Construction
The weighted visual map is based on the Itti model and the DOG. First, the saliency map is calculated by the Itti model. Second, the edge map is detected by the DOG. Finally, the weighted visual map is produced by combining the saliency map and the edge map. Since the weighted visual map can indicate the visual attention regions of an image, hash calculation via the weighted visual map can guarantee the robustness of the proposed method.
1) Saliency Map Calculation: A visual attention model can recognize the focused areas of the human visual system (HVS). To improve hashing robustness, the Itti model [43] is used to obtain the saliency map. The Itti model simulates the neural mechanism of human vision in its computational structure and has been employed in image quality assessment [44], image retrieval [45], etc. The steps of the Itti model are briefly described below.
First, the luminance saliency map SL is determined by a Gaussian pyramid with different scales. Second, the color saliency map SC is obtained by fusing the color features with different operations. Third, the directional saliency map SD is calculated by Gabor pyramids with different directions. Eventually, the final map S is produced from the luminance, color and directional saliency maps as follows.
More details of the calculations can be found in [43].
2) Edge Map Detection: An image edge is the boundary of an area where the gray level changes sharply. Edges are an effective cue for distinguishing images by the HVS and have been applied to image diffusion, object identification and image reconstruction. To enhance robustness, the DOG is used to generate an edge map. The DOG has strong robustness against image noise, scale transformation and affine attacks, and has been applied to image quality assessment [46] and image retrieval [47]. The specific process of the DOG is as follows.
First, the Gaussian function is defined as follows.
where σ is the bandwidth of the Gaussian filter. Second, the image F is convolved with the Gaussian function at two parameter values σ1 and σ2 to acquire the following two images.
where Gσ1 and Gσ2 are the Gaussian functions with the parameters σ1 and σ2, respectively. Finally, the DOG edge map D is obtained by subtracting these two images.
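The blur-and-subtract computation above can be sketched in a few lines. This is a minimal illustration using SciPy's Gaussian filter, not the authors' implementation; the boundary handling and filter truncation are SciPy defaults rather than choices stated in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_edge_map(image, sigma1=0.8, sigma2=0.1):
    """D = (F * G_sigma1) - (F * G_sigma2): blur the image at two
    bandwidths and subtract, leaving responses near gray-level edges."""
    image = np.asarray(image, dtype=np.float64)
    return gaussian_filter(image, sigma1) - gaussian_filter(image, sigma2)

# toy example: a vertical step edge responds near the boundary only
img = np.zeros((8, 8))
img[:, 4:] = 1.0
d = dog_edge_map(img)
```

Flat regions yield values near zero, while pixels adjacent to the step produce a nonzero difference, which is the edge response used for the edge map D.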
3) Weighted Map Generation: The weighted map is determined by the saliency map and the edge map to guarantee the robustness of the proposed method. To do this, the input image is first resized to M × M. Second, the Itti model is employed to find the saliency map S of the resized image. Third, the luminance component of the resized image in the YCbCr space is taken to generate the edge map D by the DOG. Finally, S and D are used to construct the weighted map C as follows.
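A rough sketch of the combination step follows. Since the paper's combination formula for C is not reproduced in this excerpt, the product of min-max normalized saliency and edge magnitude used below is a hypothetical placeholder, not the authors' rule.

```python
import numpy as np

def weighted_map(saliency, edge):
    """Hypothetical combination: scale the edge magnitude by the
    min-max normalized saliency, so edges in salient regions dominate."""
    s_min, s_max = saliency.min(), saliency.max()
    s = (saliency - s_min) / (s_max - s_min + 1e-12)  # normalize S to [0, 1]
    return s * np.abs(edge)                           # placeholder for C

S = np.array([[0., 1.], [2., 3.]])   # toy saliency map
D = np.array([[1., -1.], [2., 0.]])  # toy edge map
C = weighted_map(S, D)
```

Whatever the exact rule, the intent described in the text is the same: C should emphasize edge structure inside visual attention regions.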

B. Block-Based DCT Feature Extraction
To maintain discriminative features and improve computational efficiency, DCT features of the weighted visual map are extracted. Specifically, the weighted visual map is divided into non-overlapping blocks of size m × m. Thus, the block number is N = (M/m)2. All blocks are numbered according to their positions in the weighted visual map, that is, from left to right and from top to bottom. Let Bi be the i-th block (1 ≤ i ≤ N), and Bi(j, l) be the element of Bi in the (j+1)-th row and (l+1)-th column. Two-dimensional DCT is then applied to Bi, and the coefficients in the first row/column of the DCT result are selected as features.
The coefficients in the first row of the DCT result of Bi are calculated as below.
Likewise, the coefficients in the first column of the DCT result of Bi are computed as follows.
For the Ki(0, v) sequence, the elements from the 2nd position to the (β+1)-th position are selected to form the vector ri(1) with β elements as follows.

All block vectors ri(1) (1 ≤ i ≤ N) are arranged to form a matrix as follows:

Z1 = [r1(1), r2(1), ..., rN(1)]. (10)

Similarly, for the Ki(u, 0) sequence, the elements from the 2nd position to the (β+1)-th position are selected to make up the vector ri(2) with β elements as follows.
All block vectors ri(2) (1 ≤ i ≤ N) are arranged as follows:

Z2 = [r1(2), r2(2), ..., rN(2)].

Here, the DCT coefficients in the first row/column are exploited to construct the feature matrices because they can effectively represent the pixel changes of a block. These coefficients have been successfully used in some state-of-the-art hashing methods, such as [21] and [42]. In Sections IV-A and V-B, comparison results will show that the proposed method is better than the GLCM-DCT method [42] in classification and copy detection.
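To make the per-block feature construction concrete, here is a small sketch of extracting the first-row/first-column DCT coefficients of one block with SciPy. The orthonormal DCT normalization is an assumption, as the paper's DCT definition is not reproduced here.

```python
import numpy as np
from scipy.fft import dctn

def block_dct_features(block, beta):
    """2-D DCT of one block; keep coefficients at positions 2..beta+1 of
    the first row and of the first column (the DC term at (0,0) is skipped)."""
    K = dctn(np.asarray(block, dtype=np.float64), norm='ortho')
    r1 = K[0, 1:beta + 1]  # first row:    K_i(0, v), v = 1..beta
    r2 = K[1:beta + 1, 0]  # first column: K_i(u, 0), u = 1..beta
    return r1, r2

rng = np.random.default_rng(0)
B = rng.random((64, 64))              # stand-in for one m x m block
r1, r2 = block_dct_features(B, beta=32)
```

Stacking the r1 vectors of all N blocks column-wise yields Z1, and likewise the r2 vectors yield Z2.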
Next, these two feature matrices are cascaded to generate a feature matrix of size η × N (η = 2β) as follows.
in which zj(i) and pj(i) are the i-th elements of zj and pj (1 ≤ i ≤ N), and ςj and μj are the standard deviation and mean of zj, computed as below.
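The normalization described above is a row-wise z-score of the cascaded matrix Z; a minimal sketch (the epsilon guard for constant rows is our addition, not part of the paper):

```python
import numpy as np

def normalize_rows(Z):
    """p_j(i) = (z_j(i) - mu_j) / sigma_j, with mu_j and sigma_j the mean
    and standard deviation of row j of Z."""
    mu = Z.mean(axis=1, keepdims=True)
    sigma = Z.std(axis=1, keepdims=True)
    return (Z - mu) / (sigma + 1e-12)  # epsilon guards constant rows

Z = np.array([[0., 1., 2., 3.],
              [4., 5., 6., 7.],
              [1., 1., 1., 1.]])
P = normalize_rows(Z)
```

Each row of P then has zero mean and (for non-constant rows) unit standard deviation, which puts all feature dimensions on a comparable scale before LTSA.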

C. Feature Learning With LTSA
Manifold learning can find low-dimensional manifold structures in a high-dimensional space to achieve dimensionality reduction. LTSA [48] is an efficient manifold learning technique. Since it can maintain the local geometric relationships within an image, the learned features can guarantee the discrimination of the proposed method. Currently, LTSA has been employed in hyperspectral image classification [49], image retrieval [50], image recognition [51], and so on. In this work, we explore LTSA to extract compact features from the feature matrix of the weighted visual map in the DCT domain.
Suppose that xi is a data point in a ξ-dimensional space and there are n data points in total. The LTSA learns compact features by the steps below.
1) Construct the local neighborhood matrix: In the high-dimensional space, the k-NN method is used to search the nearest neighbors of each data point xi. The k nearest neighbors of each xi (i = 1, 2, ..., n), determined by Euclidean distance, form the neighborhood matrix Xi = [xi1, xi2, ..., xik], where xij (j = 1, 2, ..., k) is the j-th nearest neighbor point of xi.
2) Calculate local tangent space coordinates: The j-th local tangent space coordinate of xi is obtained by projecting the centered neighbor, θj(i) = QiT(xij − x̄i), in which x̄i is the mean of the neighbor points xij, and Qi is the matrix formed by the d left singular vectors of Xi(I − (1/k)eeT), where I is the identity matrix and e is the all-ones vector. The singular vectors can be obtained by SVD.

3) Generate global coordinates
The local tangent space coordinates of xi are mapped to the global coordinates yi = [y1i, y2i, ..., ydi]T. For simplicity, the global coordinates of all data points are represented by the matrix below.
In general, the matrix Y can be obtained by picking the d eigenvectors of the matrix U corresponding to its smallest eigenvalues, from the 2nd eigenvalue to the (d+1)-th eigenvalue. Specifically, the row vectors of Y are composed of these d eigenvectors of U, which is defined as below, where the selection matrices have a single element equal to 1 and all other elements equal to 0, and Φi+ is the Moore-Penrose generalized inverse of Φi. More details of LTSA can be found in [48].
In this work, the block-based DCT feature matrix of size η × N is input to the LTSA algorithm. Each column vector is regarded as a data point in the η-dimensional space, just as xi in LTSA, and there are N vectors in total. After the LTSA is applied, the compact features are extracted. Here, the compact features are the output of the LTSA, i.e., the low-dimensional matrix Y of size d × N.
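The feature-learning step can be prototyped with scikit-learn, which exposes LTSA as a variant of locally linear embedding. This is a sketch with the paper's parameter values (k = 60 neighbors, d = 30 dimensions, N = 64 blocks) on random stand-in data, not the authors' code; scikit-learn stores one data point per row, so its output is the transpose of the paper's d × N matrix Y.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

# stand-in for the normalized block-based DCT feature matrix:
# N = 64 points (one per block) in an eta = 64 dimensional space
rng = np.random.default_rng(1)
X = rng.random((64, 64))

# scikit-learn exposes LTSA via method='ltsa'
ltsa = LocallyLinearEmbedding(n_neighbors=60, n_components=30,
                              method='ltsa', eigen_solver='dense')
Y = ltsa.fit_transform(X)  # shape (N, d); the paper's Y is its transpose
```

Internally this follows the same three steps as above: k-NN neighborhoods, local tangent coordinates via SVD, and global alignment through the smallest nontrivial eigenvectors.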

D. Quantization and Encryption
First, the variance δi2 of the d-dimensional vector yi is computed as below.
in which yi(l) is the l-th element of yi, and μi is calculated as below.
Second, the mean δμ of the variances is computed and each variance is quantized as below.
Third, to generate a secure hash code, a pseudorandom generator controlled by the secret key k1 is adopted to create N random numbers L(i), which are transformed into a random binary sequence as below.
in which Lm is the median value of the N random numbers. Next, the XOR operation is used to conduct encryption as follows.
in which ⊕ is the XOR operation. Eventually, the hash code of the proposed method is determined as below.
Obviously, our hash code is a sequence of N bits. Note that the use of LTSA here is different from that of [17]. In this work, we exploit LTSA to design a novel robust hashing method for image copy detection, while the reference [17] uses LTSA to construct a robust video fingerprint for video copy detection. There are significant differences between our work and [17] in the use of LTSA. First, the feature matrices input to the LTSA are constructed differently. The reference [17] chooses the largest 100 DCT coefficients of the kurtosis image of each frame as a vector and uses the vectors of all frames to construct a video feature matrix. In this work, we use the DOG and the Itti model to construct the weighted visual map, divide it into blocks, apply two-dimensional DCT to these blocks, and exploit the DCT coefficients in the first row/column to construct the block-based image feature matrix. Second, the uses of the low-dimensional vectors produced by LTSA are also different. The reference [17] selects the Frobenius norm between each pair of vectors in the three-dimensional space to generate a bit of the video fingerprint, while we exploit the variance of each d-dimensional vector to construct the binary elements of the image hash. Our novel use of LTSA provides the proposed method with good classification performance, which will be validated in Section IV-A.
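The quantization and encryption steps above can be sketched as follows. Since the exact quantization rule (23) and the pseudorandom generator are not reproduced in this excerpt, the strict comparison against the mean variance and NumPy's keyed generator are stand-ins.

```python
import numpy as np

def hash_from_Y(Y, key=1):
    """Variance of each column of Y (the d x N LTSA output), thresholded
    at the mean variance, then XORed with a key-driven random bit string."""
    var = Y.var(axis=0)                      # delta_i^2, i = 1..N
    J = (var > var.mean()).astype(np.uint8)  # quantization (stand-in rule)
    rng = np.random.default_rng(key)         # keyed pseudorandom generator
    L = rng.random(Y.shape[1])               # N random numbers L(i)
    V = (L > np.median(L)).astype(np.uint8)  # binarize around the median L_m
    return J ^ V                             # h(i) = J(i) XOR V(i)

Y = np.random.default_rng(7).random((30, 64))  # stand-in for LTSA output
h = hash_from_Y(Y, key=1)                      # an N = 64 bit hash
```

Without the correct key, the random sequence V cannot be regenerated, so the XOR step keeps the hash key-dependent as described in the text.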

E. Pseudo-Code Description
The above sections illustrate that our robust hashing consists of four components: weighted visual map construction, block-based DCT feature extraction, feature learning with LTSA, and quantization and encryption. To improve readability of the proposed robust hashing method, the pseudo-code is described in Algorithm 1.

Algorithm 1: The proposed robust hashing method.
1: Resize the input image to M × M.
2: Compute the saliency map S of the resized image with the Itti model.
3: Take the luminance component of the resized image in the YCbCr space and compute the edge map D by DOG.
4: Construct the weighted visual map C from S and D.
5: Divide C into non-overlapping m × m blocks and obtain N = (M/m)2 blocks in total.
6: Apply two-dimensional DCT to every block, select β coefficients in the first row/column as features, and obtain the DCT feature matrix Z by (13).
7: Normalize the matrix Z to the matrix P by (14).
8: Apply LTSA to P with the parameters k and d to generate the matrix Y.
9: Extract N variances δi2 (1 ≤ i ≤ N) from the low-dimensional matrix Y by (21).
10: Quantize δi2 to get a binary sequence J(i) (1 ≤ i ≤ N) by (23).
11: Generate a binary sequence V(i) (1 ≤ i ≤ N) with a pseudorandom generator using the key k1 by (24).
12: Conduct encryption by h(i) = J(i) ⊕ V(i).
13: Determine the hash code h by (26).
14: return h.

F. Similarity Analysis
To analyze the similarity of two hash codes, the Hamming distance is employed. Suppose that h1 and h2 are two hash codes. Then, the Hamming distance is computed as below.
in which h1(l) is the l-th element of h1 and h2(l) is the l-th element of h2. The bigger the Hamming distance, the less similar the corresponding images. In practice, a threshold is generally used to determine the similarity of two images.
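For binary hash codes, the Hamming distance is simply the count of differing positions:

```python
import numpy as np

def hamming_distance(h1, h2):
    """Count the positions where two binary hash codes differ."""
    return int(np.count_nonzero(np.asarray(h1) != np.asarray(h2)))

d = hamming_distance([0, 1, 1, 0], [0, 1, 0, 1])  # differs at two positions
```

Two images are then judged similar when this distance is at or below the chosen threshold.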

III. EXPERIMENTAL RESULTS
The parameters of our robust hashing method are set as follows. The image size is 512 × 512, the convolution mask of the DOG is 9 × 9, σ1 = 0.8, σ2 = 0.1, the block size is 64 × 64, the number of neighbor points is 60, and the selected dimension is 30.

TABLE II OPERATIONS AND PARAMETERS FOR SIMILAR IMAGE CONSTRUCTION
That is, M = 512, m = 64, k = 60, β = 32 and d = 30. Thus, the length of our hash code is N = 64 bits. Our robust hashing method is implemented in MATLAB R2018b. The adopted computer has an Intel i7-8700 CPU with a main frequency of 3.20 GHz and 8 GB of memory. The following parts present various experimental results of our robust hashing method. Specifically, robustness and discrimination are tested in Sections III-A and III-B, respectively. The dimension selection of LTSA and the convolution mask of the DOG are discussed in Sections III-C and III-D, respectively.

A. Robustness Test
The Kodak database [52] is chosen to validate robustness. This database has 24 color images and Fig. 2 shows some of them. The selected operations and their parameters for similar image generation are enumerated in Table II. For every image of the Kodak database, 74 similar versions can be created by using these operations. Hence, 24 × 74 = 1776 pairs of similar versions are generated. In summary, 1776 + 24 = 1800 images are utilized in this experiment.
The Hamming distance between the hash codes of every pair of similar images is calculated. Table III demonstrates the statistical results of these distances, including the largest distance, the smallest distance, the mean and the standard deviation. From Table III, the mean values of all operations are less than 3, except for the combined attack of rotation, cropping and re-scaling. As the combined attack introduces more distortion than a single attack, its mean Hamming distance is larger. The mean value of the combined attack is 10.1708, which is larger than those of single attacks but is still small. The calculation results illustrate that when the threshold is 10, the proposed method can correctly identify 92.79% of similar images. When the threshold is 18, the proposed method reaches a correct detection rate of 99.38%. Therefore, our robust hashing method has sufficient robustness according to the high correct detection rate.

B. Discrimination Test
The VOC2007 database is employed to check discrimination. There are 5011 color images in the VOC2007 database and Fig. 3 lists some of them. Hamming distances between the hash codes of each pair of images in the database are computed. The total number of distances thus reaches 5011 × (5011 − 1)/2 = 12552555. Fig. 4 shows the distribution of these distances. The calculation results show that the smallest and largest distances are 0 and 62, and the standard deviation and mean are 6.3245 and 30.7608, respectively. The mean distance between hash codes of different images is 30.7608, which is larger than the largest mean of similar images (10.1708). This illustrates that our robust hashing method is discriminative.
Note that discrimination and robustness are both related to the chosen threshold. Table IV demonstrates their detailed

TABLE IV DETECTION PERFORMANCES UNDER VARIOUS THRESHOLDS
performances under different thresholds. In Table IV, discrimination is expressed by the false detection rate and robustness is represented by the correct detection rate. Clearly, when the threshold is 16, the total error rate is minimized. In this case, our robust hashing method achieves the best performance from the viewpoint of balancing discrimination and robustness. Hence, the threshold 16 can be chosen as the recommended value. Certainly, a suitable threshold can be determined according to the demands of the actual application.

C. Selection of Dimension
To view our performance under different dimension selections, the receiver operating characteristic (ROC) graph [54] is used to carry out experimental analysis. In this graph, the ordinate and the abscissa are represented by P1 and P2, defined as follows.
P1 = (number of similar images correctly detected) / (total number of similar images), (28)

P2 = (number of different images falsely detected as similar) / (total number of different images). (29)

The definitions of (28) and (29) illustrate that P2 is the quantitative metric of discrimination and P1 is the quantitative metric of robustness. A small P2 value implies good discrimination, while a large P1 value means good robustness. A set of coordinate points (P2, P1) is used to plot the ROC curve. If there are two curves in a graph, the curve nearer the upper-left corner has better performance. In addition, the area under the ROC curve (AUC) is computed for performance comparison, where the range of AUC is [0, 1]. The curve with a larger AUC is the better one.
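The points (P2, P1) can be generated by sweeping a decision threshold over the Hamming distances of similar and different image pairs. The sketch below uses toy distances and a standard trapezoidal approximation of the AUC; none of the numbers are from the paper.

```python
import numpy as np

def roc_point(similar_d, different_d, t):
    """At threshold t: P1 = fraction of similar pairs judged similar
    (distance <= t); P2 = fraction of different pairs wrongly judged
    similar. P1 measures robustness, P2 measures discrimination."""
    p1 = float((np.asarray(similar_d) <= t).mean())
    p2 = float((np.asarray(different_d) <= t).mean())
    return p1, p2

sim = np.array([1, 2, 3, 10])      # toy distances between similar pairs
diff = np.array([20, 25, 30, 40])  # toy distances between different pairs
pts = [roc_point(sim, diff, t) for t in range(0, 65)]
p1 = np.array([p[0] for p in pts])
p2 = np.array([p[1] for p in pts])
# trapezoidal area under the (P2, P1) curve
auc = float(np.sum(np.diff(p2) * (p1[1:] + p1[:-1]) / 2))
```

Because the toy similar and different distances are perfectly separated, the resulting AUC is 1.0; overlapping distance distributions would pull it below 1.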
In this section, the datasets are the same as the image libraries of Sections III-A and III-B. Different dimension selections of LTSA in feature learning are discussed, i.e., different d values. Specifically, the d value is chosen from {10, 20, 30, 40, 50}, and the other parameter settings remain the same. Fig. 5 demonstrates the curves under different d values. The AUC values for the dimensions 10, 20, 30, 40 and 50 are 0.99808, 0.99936, 0.99955, 0.99931 and 0.99782, respectively. Obviously, the AUC of dimension 30 is larger than those of the other dimensions. Moreover, the running times for the dimensions 10, 20, 30, 40 and 50 are 0.1684, 0.1696, 0.1716, 0.1721 and 0.1753 seconds, respectively; the differences in running time are slight. Performances under different d values are listed in Table V. Therefore, our proposed robust hashing method reaches preferable performance when d is 30.

D. Selection of Convolution Mask Size
In the construction of the weighted visual map, the DOG technique is used, where the size of the convolution mask is a key parameter. This section discusses the performance of the proposed robust hashing method under different convolution mask sizes: 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11. Fig. 6 demonstrates the ROC curves under different convolution mask sizes. The AUCs of 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11 are 0.99937, 0.99950, 0.99946, 0.99955 and 0.99950, respectively. When the convolution mask size is 9 × 9, the AUC reaches the maximum value, which illustrates that 9 × 9 is better than the other mask sizes for our hashing method. The running times for the mask sizes 3 × 3, 5 × 5, 7 × 7, 9 × 9 and 11 × 11 are 0.1718, 0.1734, 0.1719, 0.1716 and 0.1747 seconds, respectively. A performance summary of the different convolution mask sizes is shown in Table VI. Obviously, compared with the other convolution mask sizes, 9 × 9 provides our robust hashing method with a better overall performance.

IV. PERFORMANCE COMPARISON
To demonstrate its advantages, some popular robust hashing methods are employed for comparison, including the SVD-CSLBP method [27], RW method [28], MDS method [34], HCVA-DCT method [37], LRR-RP method [38] and GLCM-DCT method [42]. The papers describing these methods are published in well-known international journals or conference proceedings. Moreover, the SVD-CSLBP and MDS methods also use dimensionality reduction techniques, namely SVD and MDS. To ensure a fair comparison, the parameter values and hash similarity metrics reported in the papers of these methods are used here, and all images are resized to 512 × 512 before being input to these methods. In the sections below, Section IV-A presents classification performance, and Section IV-B compares time and storage performance.

A. Classification Performance
The image databases described in Section III are used to evaluate classification performance. Specifically, 1800 images are used for robustness and 5011 images are taken for discrimination. The ROC graph is again used for visual comparison. In Fig. 7, the curves of all evaluated methods are drawn in the same graph for easy comparison, and their local details are enlarged for a better view. Obviously, the curve of our robust hashing method is nearer the upper-left corner than those of the compared methods. The visual comparison demonstrates that the classification performance of our robust hashing method outperforms those of the compared methods. Moreover, the quantitative metric AUC is also calculated. The experimental results demonstrate that the AUC of our robust hashing method is 0.99955, while the AUCs of the SVD-CSLBP method, RW method, MDS method, HCVA-DCT method, LRR-RP method and GLCM-DCT method are 0.86697, 0.97322, 0.99010, 0.97301, 0.99228 and 0.99605, respectively. Clearly, the AUC of our robust hashing method is greater than those of the compared methods, which again demonstrates its superiority. Our proposed robust hashing method achieves this competitive classification performance for the following reasons. The weighted visual map based on the DOG and the Itti model can indicate visual attention regions, so hash calculation with the weighted visual map can guarantee robustness. In addition, LTSA can maintain the local geometric relationships within the image, so the compact features learned with LTSA can ensure discrimination.

B. Time and Storage Performances
Time performance is judged by the computing time of creating a hash code. The results demonstrate that our proposed robust hashing method has a computing time of 0.1716 seconds. The computing times of the SVD-CSLBP method, RW method, MDS method, HCVA-DCT method, LRR-RP method and GLCM-DCT method are 0.1274, 0.0683, 0.4288, 0.0329, 28.357 and 0.1211 seconds, respectively. Obviously, our proposed method is quicker than the MDS method and the LRR-RP method, but slower than the other compared methods. It is particularly worth noting that the main techniques of our proposed method, the SVD-CSLBP method and the MDS method are LTSA, SVD and MDS, which are all dimensionality reduction techniques. Clearly, our proposed robust hashing method is competitive in time performance among dimension reduction-based hashing methods.
Storage performance is determined by the required bits of a hash code. Our proposed robust hashing method produces a hash code with a bit length of 64. The bit lengths of the hash codes calculated by the RW method, the MDS method, the LRR-RP method and the GLCM-DCT method are 144, 720, 384 and 720, respectively. For the SVD-CSLBP method and the HCVA-DCT method, the hash lengths are 64 and 20 floating-point numbers, respectively. Note that a floating-point number requires 32 bits under the IEEE standard. Thus, the SVD-CSLBP method and the HCVA-DCT method need 2048 and 640 bits for saving a hash code, respectively. The storage comparison shows that our proposed robust hashing method reaches the minimum storage cost. Table VII lists the time and storage performances of the evaluated methods.

V. APPLICATION TO COPY DETECTION
With the wide application of digital images, people pay much attention to digital rights management (DRM), and useful techniques for DRM are in demand. Image copy detection is a significant task of DRM and can be used to protect image copyright. In practice, image copies are often generated by normal digital operations (e.g., compression, contrast/brightness adjustment and scaling) or by inserting a copyright logo/text. The visual contents of image copies are almost the same as those of their original images. Therefore, image copy detection [7], [15], [20] aims to efficiently find such similar images for a given query image. Due to the hashing benefits of robustness and low storage, many researchers have exploited robust hashing to conduct copy detection.
In this section, we validate the copy detection performance of our proposed robust hashing method. Specifically, Section V-A describes the used database and metric, and Section V-B shows the results of copy detection.

A. Database and Metric
To construct an image database for copy detection, the UCID [55] is used. The UCID contains 1338 color images of size 512 × 384 or 384 × 512. In the experiment, 48 images are randomly chosen as the query images. Fig. 8 shows the thumbnails of these query images. Each query image is attacked by 15 digital operations, including logo insertion (the logo is placed in the bottom right corner) and image deformation with rotation (IDR, angle: 30°). So there are 48 × 15 = 720 image copies. These image copies and the UCID images excluding the above selected 48 images form the copy image database. Therefore, the total number of images in the database is 720 + 1338 − 48 = 2010. For every query image, there are 15 image copies and 1995 different images.
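The database composition can be verified with a few lines of arithmetic:

```python
# Composition of the copy-detection database built from the UCID.
n_ucid, n_queries, n_operations = 1338, 48, 15

n_copies = n_queries * n_operations           # 48 * 15 = 720 image copies
n_total = n_copies + n_ucid - n_queries       # 720 + 1338 - 48 = 2010 images
per_query_copies = n_operations               # 15 copies of each query
per_query_different = n_total - n_operations  # 1995 different images per query

print(n_copies, n_total, per_query_different)  # 720 2010 1995
```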
To view the image copies produced by the 15 digital operations, typical examples are presented in Fig. 9, where (a) is an original image of the UCID and (b)-(p) are the 15 image copies of (a). To show the effect of these digital operations, the Hamming distances between the hashes of the original image and its copies are calculated by our method. To confirm the copy detection performance of different methods, the mean average precision (MAP) is adopted. MAP is determined by the average precision (AP). The calculation of AP is related to the order of the images returned by each method. The formula of AP is as below:

AP = (1/R) Σ_{i=1}^{n} (f_i / i) Σ_{j=1}^{i} f_j

where f_i = 1 if the ith returned image is an image copy and f_i = 0 otherwise, n is the number of returned images, and R = Σ_{i=1}^{n} f_i is the number of returned image copies. The MAP is obtained by averaging the APs of all query images. The scope of MAP is [0, 1]. Generally, a greater MAP indicates a better copy detection performance.
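As a concrete check, AP and MAP can be sketched in a few lines. The sketch assumes the standard average-precision definition over the binary relevance flags f_i (the paper's exact formula is not fully reproduced in this extract):

```python
import numpy as np

def average_precision(flags):
    """AP for one query. flags[i] = 1 if the (i+1)-th returned image
    is a true copy, else 0 (the f_i indicator from the paper)."""
    flags = np.asarray(flags, dtype=float)
    if flags.sum() == 0:
        return 0.0
    ranks = np.arange(1, len(flags) + 1)
    precision_at_i = np.cumsum(flags) / ranks
    # average the precision values at the ranks where true copies appear
    return float((precision_at_i * flags).sum() / flags.sum())

def mean_average_precision(all_flags):
    """MAP over all queries: the mean of the per-query APs."""
    return float(np.mean([average_precision(f) for f in all_flags]))

# toy example: two queries, each with a ranked relevance list
q1 = [1, 1, 0, 1, 0]  # copies returned at ranks 1, 2 and 4
q2 = [0, 1, 0, 0, 1]  # copies returned at ranks 2 and 5
print(average_precision(q1))            # (1/1 + 2/2 + 3/4) / 3 ≈ 0.9167
print(mean_average_precision([q1, q2]))
```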

B. Detection Results
To show its advantage, the copy detection performance of our proposed robust hashing method is compared with the SVD-CSLBP method [27], the RW method [28], the MDS method [34], the HCVA-DCT method [37], the LRR-RP method [38] and the GLCM-DCT method [42]. Note that the SVD-CSLBP method and the MDS method are also dimension reduction-based methods, and the HCVA-DCT method and the GLCM-DCT method are DCT-based methods. The MAPs of these hashing methods are computed and presented in Fig. 10. It can be seen that the MAP of our proposed robust hashing method is 0.97588. The MAPs of the SVD-CSLBP method, the RW method, the MDS method, the HCVA-DCT method, the LRR-RP method and the GLCM-DCT method are 0.55938, 0.82294, 0.73250, 0.75722, 0.72306 and 0.91376, respectively. Clearly, the MAP of our proposed method is higher than those of the compared methods. Our proposed method achieves better copy detection performance than the compared methods because its better classification performance reduces classification errors during copy detection.
A typical example of copy detection is shown in Fig. 11 for visual comparison. In this example, the first image in the upper left corner of Fig. 8 is selected as the query image, and the top 15 images are returned by each method. Due to page limitation, only the returned images ranked 10th to 15th are listed in Fig. 11, where an image in a green box is a different image, i.e., a wrongly returned image. From Fig. 11, it can be observed that all methods except our proposed method return incorrectly detected images. Specifically, the SVD-CSLBP method returns five wrong images; the RW method, the MDS method and the LRR-RP method return two each; and the GLCM-DCT method and the HCVA-DCT method return only one each. This experiment validates that our proposed method has superior performance in copy detection.
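The retrieval step that produces such a ranked list can be sketched as follows: hash codes are compared by Hamming distance and the closest database entries are returned. The 64-bit binary hashes below are random stand-ins, not hashes produced by the paper's method:

```python
import numpy as np

def hamming_distance(h1, h2):
    """Hamming distance between two binary hash codes (0/1 arrays)."""
    return int(np.count_nonzero(np.asarray(h1) != np.asarray(h2)))

def top_k_copies(query_hash, db_hashes, k=15):
    """Indices of the k database images whose hashes are closest
    to the query hash, i.e. the candidate image copies."""
    dists = [hamming_distance(query_hash, h) for h in db_hashes]
    return sorted(range(len(db_hashes)), key=lambda i: dists[i])[:k]

rng = np.random.default_rng(0)
db = rng.integers(0, 2, size=(100, 64))  # 100 fake 64-bit binary hashes
query = db[7].copy()
query[:3] ^= 1                           # a near-duplicate of image 7 (3 bits flipped)
print(top_k_copies(query, db, k=5))      # image 7 should rank first
```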

VI. CONCLUSION
A novel robust hashing method via LTSA has been proposed for image copy detection. A significant contribution is the construction of a weighted visual map based on the Itti model and DOG, which provides our proposed method with good robustness. Another key contribution is the feature learning via LTSA. As it can maintain the local geometric relationships within an image, the learned compact features guarantee the discrimination of our proposed method. Extensive experiments have been carried out. Comparisons using the AUC metric have demonstrated that our proposed robust hashing method has preferable classification performance. Copy detection results have shown the effectiveness of our proposed robust hashing method.
Research on robust hashing for copy detection is still under way. Currently, most image copy detection techniques can correctly recognize image copies generated by inserting a copyright logo/text, but they also mistakenly classify locally tampered images (e.g., face replacement) as image copies because only a small image region is changed. In future research, we will try to address this limitation and develop robust hashing for video copy detection.
in which C(a, b), S(a, b) and D(a, b) are the elements of C, S and D in the ath row and bth column (1 ≤ a ≤ M , 1 ≤ b ≤ M ), respectively.

Algorithm 1: Our Robust Hashing With LTSA.
Input: an image I; parameters M, m, k, β, d, k1.
Output: hash code h.
1: Resize I to M × M.
2: Calculate the saliency map S by the Itti model.
3: Exploit DOG to compute the edge map D of the luminance component of the resized image in the YCbCr space.
4: Construct the weighted map C by combining S and D using (6).
5: Divide the weighted map C into blocks of size m × m.
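Steps 1-5 of the algorithm can be sketched as below. This is a minimal illustration, not the paper's implementation: the Itti saliency model is replaced by a trivial placeholder, the DOG sigmas are illustrative, and the combining rule standing in for Eq. (6) (a convex combination weighted by β) is an assumption, since the extract does not give these details:

```python
import numpy as np

def gaussian_blur(x, sigma):
    """Separable Gaussian blur using 1-D convolutions (zero-padded edges)."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    x = np.apply_along_axis(lambda a: np.convolve(a, kernel, mode="same"), 0, x)
    x = np.apply_along_axis(lambda a: np.convolve(a, kernel, mode="same"), 1, x)
    return x

def dog_edge_map(luma, sigma1=1.0, sigma2=2.0):
    """Difference-of-Gaussian edge map of the luminance channel."""
    return gaussian_blur(luma, sigma1) - gaussian_blur(luma, sigma2)

def weighted_map(saliency, edges, beta=0.5):
    """Hypothetical combining rule standing in for Eq. (6):
    a convex combination of the saliency and edge maps."""
    return beta * saliency + (1.0 - beta) * edges

M, m = 64, 8
rng = np.random.default_rng(1)
resized = rng.random((M, M))         # stand-in for the image resized to M x M (step 1)
saliency = resized / resized.max()   # placeholder for the Itti saliency map (step 2)
edges = dog_edge_map(resized)        # DOG edge map of the luminance (step 3)
C = weighted_map(saliency, edges)    # weighted map (step 4)
# step 5: split C into an (M/m) x (M/m) grid of m x m blocks
blocks = C.reshape(M // m, m, M // m, m).swapaxes(1, 2)
print(C.shape, blocks.shape)  # (64, 64) (8, 8, 8, 8)
```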
(b)-(p) are presented in ascending order according to the Hamming distance d_H. It is observed that CA, BA, GLF, JC and LE make

Fig. 9. Original image and its 15 copies presented in ascending order according to the Hamming distance d H .

TABLE I
CORE TECHNIQUES OF SOME TYPICAL ROBUST HASHING METHODS

with good balance between discrimination and robustness for detecting image copies. The prime contributions of this work are as below.
1) A weighted visual map is extracted for hash calculation. This visual map is based on the difference of Gaussian (DOG) and the visual attention model (VAM) named the Itti model. Specifically, an edge map is first determined by using the DOG, and a saliency map is then produced by the Itti model. Finally, the weighted visual map is produced by combining the saliency map and the edge map. Since the weighted visual map can indicate the visual attention regions of an image, it can provide the proposed hashing method with

TABLE III
STATISTICAL RESULTS OF HAMMING DISTANCES BASED ON THE KODAK DATABASE

Fig. 3. Some images of the VOC2007 database.

TABLE V
PERFORMANCE UNDER DIFFERENT DIMENSIONS

TABLE VI
PERFORMANCE UNDER DIFFERENT CONVOLUTION MASK SIZES

TABLE VII
TIME AND STORAGE PERFORMANCES OF DIFFERENT METHODS

Table VIII lists the numbers of the used query images in the UCID. To simulate