Combining Statistical Features and Local Pattern Features for Texture Image Retrieval

The complementary fusion of global and local features can effectively improve the performance of image retrieval. This article proposes a new local texture descriptor, combined with statistical modeling in transform domain for texture image retrieval. The proposed local descriptor calculates the eight directions of the central pixel by using the relationship between the central pixel and the neighboring pixels in six directions, which is called the local eight direction pattern (LEDP). In the texture image retrieval system of this article, the feature extraction part combines global statistical features and local pattern features. Among them, both the relative magnitude (RM) sub-band coefficients and relative phase (RP) sub-band coefficients are modeled as wrapped Cauchy (WC) distribution in the dual-tree complex wavelet transform (DTCWT) domain, and the global statistical features employ the parameters of this model; while the local pattern features respectively choose the local binary pattern (LBP) histogram features in the spatial domain and the LEDP histogram features of each direction sub-band in the DTCWT domain. On the other hand, the similarity measurement selects matching distances for different features and combines them in the form of convex linear optimization. Texture image retrieval experiments are conducted in the Corel-1k database (DB1), Brodatz texture database (DB2) and MIT VisTex texture database (DB3), respectively. Experimental results show that, compared with the best existing methods, the approach proposed in this article has achieved better retrieval performance.


I. INTRODUCTION
With the development of multimedia technology and the arrival of the digital age, the number of digital images in Internet databases is increasing exponentially. How to find the required images in these databases has become an urgent problem. Content-based image retrieval (CBIR) is a technique that uses features representing image content to search for images similar to a query image. Usually, CBIR extracts features from both the database images and the query image, and selects the images that best match the query image [1]. In other words, a CBIR system mainly includes two essential parts: feature extraction and similarity measurement. The effectiveness of the former mainly depends on how features are extracted and what types of features are extracted. The accuracy of the latter is determined by which form of similarity measurement is selected. Both affect retrieval performance. Therefore, finding an effective image retrieval method has become a hot topic in current research. Comprehensive and detailed literature reviews on CBIR are given in [2]-[4].
The associate editor coordinating the review of this manuscript and approving it for publication was Wenming Cao.
Texture is a primary visual feature of an image, and texture images represent a large class of natural images. Texture image analysis has been widely used in image retrieval [5], human identification [6], segmentation of remote sensing images [7], defect classification [8], etc. Texture features, shape features and color features are all important features of texture analysis, which can be applied in different fields, especially in the field of image processing. However, the selected features differ from one image processing task to another [9], [10]. Texture features are significant features in image retrieval. Compared with shape and color features, texture features can capture more image information and can be extracted effectively in both the spatial and transform domains. In the transform domain, statistical features based on the wavelet transform have become one of the essential features for describing texture images, because wavelet analysis is very consistent with human visual perception. Do et al. [11] proposed generalized Gaussian distribution (GGD) statistical features based on detail sub-band coefficients in the discrete wavelet transform (DWT) domain. Kwitt et al. [12] proposed statistical features based on the Gamma and Weibull distributions of magnitude sub-band coefficients in the DTCWT domain. Oulhaj et al. [13] proposed GGD statistical features of RM sub-band coefficients and Gaussian mixture statistical features of RP sub-band coefficients based on the complex wavelet transform [14]. Vo et al. [15] used the GGD statistical features of real and imaginary sub-band coefficients and the Vonn distribution statistical features of RP sub-band coefficients in the uniform discrete curvelet complex transform domain.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
On the other hand, the local pattern feature is a critical feature of texture image in the spatial domain. It is widely used in the classification and retrieval of natural and texture images. Ojala et al. [16] proposed LBP and applied it to texture image classification. Zhang et al. [17] proposed the local derivative pattern (LDP) for face recognition.
Tang et al. [18] further extended LBP to face recognition under different illumination conditions. Tajeripour et al. [19] proposed a modified LBP for stone porosity computing.
Verma et al. [20] proposed a local tri-directional pattern (LTriDP) for texture image retrieval. Pan et al. [21] proposed the local vector quantization pattern (LVQP) for texture classification. Gupta et al. [22] proposed color texture image retrieval based on LBP and local extrema peak valley pattern (LDPVP), combined with local directional peak valley pattern (LDPEVP) and color feature. Chakraborty et al. [23], [24] proposed the local gradual hexagon pattern and the R-theta local neighborhood pattern for facial image recognition and retrieval. Dubey et al. [25] proposed the local bit-plane decoded pattern for biomedical image retrieval. Verma et al. [26] combined LBP with the local neighborhood difference pattern (LNDP) for natural and texture image retrieval. Liu et al. [27], [28] proposed the binary rotation invariant and noise tolerant (BRINT) descriptor and the median robust extended local binary pattern (MRELBP) for texture classification; both approaches are robust to noise, illumination changes, etc. In these methods, the global features captured in the transform domain lack a description of local information, while the local features computed in the spatial domain make no use of transform-domain information. To further explore the application of local pattern features in the transform domain, Qian et al. [29] extended the conventional local binary pattern to the pyramid transform domain (PLBP) for texture classification. Akoushideh et al. [30] further used the multi-level, multi-resolution and multi-band (ML + MR + MB) approach on the basis of PLBP to increase the accuracy of classification. Murala et al. [31] proposed a local tetra pattern (LTrP) based on the Gabor wavelet transform domain, which produced four directional features and one magnitude feature, and effectively improved the performance of texture image retrieval. Lei et al. [32] used the Gabor wavelet transform to verify that image information in the joint spatial domain and transform domain can provide more useful features.
In recent years, because a single feature cannot fully describe the image information, multi-feature fusion has become an effective method for texture image classification and retrieval. Yang et al. [33] fused statistical features and LBP features based on the DTCWT domain for texture image classification. Ershad et al. [34] fused the gray level co-occurrence matrix (GLCM) and the LBP feature for texture classification, and applied K-L divergence and hybrid color LBP to color image classification [35]. Kumar et al. [36] fused LBP features in the spatial domain with mean and variance features in the contourlet transform domain for texture image retrieval, using the chi-square distance for similarity measurement. Zhou et al. [37] proposed to fuse the features of the color histogram, local direction pattern and dense SIFT based on bag of features (BoF), to use a diffusion process to optimize the global matching of the fused multi-feature images, and to use the L1 distance as the similarity measurement. Nazir et al. [38] proposed a fusion method based on color and shape features, where the color moment and the color histogram are selected to represent color features, and invariant moments are adopted for shape features. Naghashi et al. [39] combined two spatial features, in which the local ternary pattern (LTP) was first calculated in the image spatial domain, and the feature vector was then obtained by applying the GLCM to the LTP. In [39], this combination retains the strong robustness of LTP and, through the GLCM, extracts the spatial correlation between adjacent pixels as well as the spatial information and frequency of adjacent local patterns; the L1 distance is also used as the similarity measurement.
Because these methods fuse multiple features, the image information can be fully described, and the performance of texture image retrieval or classification can be effectively improved.
Our motivation is to find a local descriptor in the multi-scale and multi-directional transform domain, which is multi-directional and can make full use of the directional information in the sub-band coefficients. In addition, we hope to select the same statistical model for different sub-band coefficient modeling in the transform domain, and at the same time, find a method that can effectively combine multiple features for texture image retrieval. For these reasons, in this article a new texture image retrieval method which fuses global statistical features and local pattern features is proposed based on the spatial domain and DTCWT domain.  To sum up, the main contribution of this article is threefold: i) a new local descriptor LEDP is proposed to make better use of the directional sub-band information in DTCWT domain; ii) the same statistical model is used for the RM sub-band coefficient and the RP sub-band coefficient to ensure the unity of the modeling form; iii) an optimized combination of similarity measurements is used to realize better matching between the extracted features and the corresponding similarity measurements.
The rest of this article is organized as follows. Section II introduces the related theory and the detailed calculation steps of the LEDP. Section III presents in detail the features and similarity measures used in this article. Experimental results and discussions are given in Section IV. Section V concludes with a summary of this article and indicates some possible future work.

II. RELATED THEORY

A. DTCWT
DTCWT is a multi-scale and multi-directional complex image transform proposed by Kingsbury [40]. It has two obvious advantages for texture image analysis: good translation invariance and better directional selectivity. In general, six high-frequency complex directional sub-bands (±15°, ±45°, ±75°) and two low-frequency approximation bands are obtained by 2-D DTCWT decomposition. Fig. 1 illustrates the basis images of the six directional sub-bands in the DTCWT domain. The first row shows the real parts of the complex wavelets, and the second row shows the imaginary parts.
After DTCWT decomposes the image, the complex sub-band coefficients at different scales are obtained. Since each complex sub-band coefficient has a real part and an imaginary part, the phase sub-band coefficient p and the magnitude sub-band coefficient m at the spatial position (i, j) can be obtained from these two parts. From p and m, the RP and RM sub-band coefficients, which make texture information easier to distinguish, can be obtained; at the spatial position (i, j) in the complex sub-band they are defined by (1) and (2), where RP(i, j) and RM(i, j) represent the RP sub-band coefficient and the RM sub-band coefficient at the position (i, j), respectively, and $p_{s,k}$ and $m_{s,k}$ represent the phase and magnitude coefficients in the kth direction sub-band at scale s, respectively. The histogram of the phase sub-band coefficients is uniform and carries no discriminative information about the image, while the histogram of the RP sub-band coefficients does carry information that distinguishes different images [15]. The magnitude sub-band coefficient is always greater than zero, which limits the range of statistical models that can be fitted to its histogram. Therefore, the RM sub-band coefficient is selected to widen the choice of statistical model.
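As a minimal sketch of the first step described above, the following Python fragment derives the magnitude and phase of a single complex sub-band coefficient from its real and imaginary parts. The function name and the sample coefficient are hypothetical and only serve as an illustration; the subsequent RP and RM computation is not shown.

```python
import math

def magnitude_and_phase(real_part, imag_part):
    """Magnitude m and phase p (radians, in [-pi, pi]) of one complex coefficient."""
    m = math.hypot(real_part, imag_part)
    p = math.atan2(imag_part, real_part)
    return m, p

# Hypothetical coefficient 3 + 4j at some position (i, j).
m, p = magnitude_and_phase(3.0, 4.0)
```

The phase is returned on the circle [−π, π], which is the support assumed for the circular model in the next subsection.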

B. WRAPPED CAUCHY DISTRIBUTION
The probability density function of the wrapped Cauchy distribution is given by [41]

$$p(\theta;\rho,\mu)=\frac{1}{2\pi}\,\frac{1-\rho^{2}}{1+\rho^{2}-2\rho\cos(\theta-\mu)} \tag{3}$$

where θ ∈ [−π, π]; µ ∈ [−π, π] is the position parameter; ρ ∈ [0, 1) is the scale parameter, and the larger its value, the sharper the corresponding probability density curve. The model parameters can be estimated by the maximum likelihood method.
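The behavior of the WC density can be checked numerically. The sketch below assumes the standard form of the density, (1/2π)(1 − ρ²)/(1 + ρ² − 2ρ cos(θ − µ)); the function names and the parameter values are illustrative only. It verifies that the density integrates to one over [−π, π] and that a larger ρ gives a higher peak at θ = µ.

```python
import math

def wrapped_cauchy_pdf(theta, rho, mu):
    """Wrapped Cauchy density on [-pi, pi] with scale rho in [0, 1) and location mu."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    denom = 2.0 * math.pi * (1.0 + rho ** 2 - 2.0 * rho * math.cos(theta - mu))
    return (1.0 - rho ** 2) / denom

def integrate_over_circle(f, n=20000):
    """Midpoint rule on [-pi, pi]."""
    h = 2.0 * math.pi / n
    return h * sum(f(-math.pi + (k + 0.5) * h) for k in range(n))

# Illustrative parameters, not taken from the experiments.
total = integrate_over_circle(lambda t: wrapped_cauchy_pdf(t, 0.6, 0.3))
peak_sharp = wrapped_cauchy_pdf(0.3, 0.9, 0.3)  # larger rho
peak_flat = wrapped_cauchy_pdf(0.3, 0.2, 0.3)   # smaller rho
```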

C. LBP
Ojala et al. [16] proposed LBP for texture image analysis. Given a center pixel of the image, the LBP value is obtained by comparing the gray value of the center pixel with the gray values of its neighboring pixels. LBP with radius R and P neighbors is defined as

$$LBP_{P,R}=\sum_{p=0}^{P-1}s(g_p-g_c)\,2^{p},\qquad s(x)=\begin{cases}1, & x\ge 0\\ 0, & x<0\end{cases}$$

where $g_c$ is the gray value of the central pixel and $g_p$ is the gray value of the pth neighboring pixel.
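A small worked example may help. The sketch below computes the basic LBP code (P = 8, R = 1) for the center of a 3 × 3 patch by thresholding each neighbor against the center and weighting the result by powers of two. The neighbor ordering (starting at the right neighbor and proceeding counter-clockwise) and the sample patch are assumptions made for illustration.

```python
def lbp_code(patch):
    """LBP code of the center of a 3x3 patch (P = 8, R = 1).

    Neighbors are visited in an assumed order: right neighbor first,
    then counter-clockwise around the center.
    """
    gc = patch[1][1]
    offsets = [(1, 2), (0, 2), (0, 1), (0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
    code = 0
    for p, (r, c) in enumerate(offsets):
        if patch[r][c] - gc >= 0:  # s(g_p - g_c) = 1
            code += 1 << p         # weight 2^p
    return code

# Hypothetical gray values.
patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch)  # neighbors 3..7 are >= 6, so code = 8 + 16 + 32 + 64 + 128
```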

D. LEDP
The LTrP proposed by Murala et al. [31] uses the direction of each pixel to capture the local information of the image effectively. In the spatial domain, the first-order derivatives along the 0° and 90° directions are calculated first to obtain the direction of the central pixel, and the LTrP value is then calculated according to that direction. Reference [31] also proposes an LTrP based on the Gabor transform, which uses the 0° and 90° real sub-band coefficients in the Gabor transform domain to calculate the direction of the center pixel of the sub-band image, thus obtaining the LTrP value in the transform domain. According to Section II.A, each layer in the DTCWT domain generates six complex sub-bands, and the real and imaginary parts of each sub-band have the same direction. In this article, the LEDP based on the DTCWT domain is proposed, and the LEDP in the transform domain is calculated by using the coefficients of the real and imaginary parts of the six directional sub-bands (±15°, ±45°, ±75°). Precisely, given an image I, the first-order derivatives along the ±15°, ±45° and ±75° directions are computed at the central pixel $g_c$ from its neighborhood pixel values in the corresponding direction sub-band, and $Dir_{\pm 15°}$, $Dir_{\pm 45°}$ and $Dir_{\pm 75°}$ are then calculated from them. Finally, the obtained $Dir_{\pm 15°}$, $Dir_{\pm 45°}$ and $Dir_{\pm 75°}$ are converted into the direction of the central pixel by (9), where p is the number of directional pairs. According to (9), the direction of the central pixel takes one of the values 1, 2, ..., 8. In this way, every pixel of the image is assigned one of eight values (directions). If the direction obtained by (9) is "1", the second-order $LEDP^2(g_c)$ of direction 1 is defined by (10), where $f_2$ is a function that judges the relationship between the direction of the center pixel and the direction of the neighborhood pixel.
If the direction of the center pixel is the same as that of the neighborhood pixel, the direction of the assigned neighborhood pixel is 0, and if they are different, the original direction will be maintained.
Once the 8-bit local pattern values of the central pixel are obtained by (10) and (11), the seven binary pattern values of direction 1 can be obtained, where φ = 2, 3, ..., 8 and $f_3$ is a function that converts a neighborhood pixel direction to 0 or 1. Similarly, the patterns for the other seven directions of the central pixel are converted to binary patterns, giving a total of 56 (7 × 8) binary pattern values.
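The two-stage coding described above can be sketched as follows. The first function implements the rule stated for $f_2$ (a neighbor that shares the direction of the center is coded 0, otherwise it keeps its own direction). The second function is an assumption: it follows the LTrP convention of [31], in which $f_3$ sets a bit to 1 only where the pattern value equals φ. The function names, the bit ordering and the sample neighbor directions are all hypothetical.

```python
def ledp_pattern(center_dir, neighbor_dirs):
    """f_2 step: 0 where a neighbor shares the center direction, else its own direction."""
    return [0 if d == center_dir else d for d in neighbor_dirs]

def binary_patterns(center_dir, neighbor_dirs, n_dirs=8):
    """f_3 step (assumed, LTrP-style): one binary code per direction phi != center_dir."""
    pattern = ledp_pattern(center_dir, neighbor_dirs)
    codes = {}
    for phi in range(1, n_dirs + 1):
        if phi == center_dir:
            continue
        # Bit p is set when neighbor p carries direction phi.
        codes[phi] = sum(1 << p for p, d in enumerate(pattern) if d == phi)
    return codes

# Hypothetical directions of the eight neighbors of a center pixel with direction 1.
neighbors = [1, 3, 3, 8, 1, 2, 5, 3]
pattern = ledp_pattern(1, neighbors)   # [0, 3, 3, 8, 0, 2, 5, 3]
codes = binary_patterns(1, neighbors)  # seven codes, for phi = 2, ..., 8
```

With eight possible center directions and seven values of φ each, this yields the 56 binary patterns counted in the text.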
Guo et al. [42] proposed the magnitude LBP using the magnitude component of local difference operators and verified its effectiveness. Following their approach, this article puts forward MLEDP and PLEDP, which are computed from the first-order derivatives of the central pixel $g_c$ in the ±15°, ±45° and ±75° directions, where $M_{I^1(g_p)}$ is the magnitude of the pixel and $P_{I^1(g_p)}$ is the phase of the pixel. In order to reduce the computational complexity, this article selects the uniform patterns [43]. Therefore, for the neighborhood P = 8, the feature vector length of the LEDP in each direction is reduced to P(P−1)+3 = 59. For a given image I, m scale decompositions are performed in the DTCWT domain to obtain six complex directional sub-bands at each scale. First, the first-order derivatives of the real and imaginary sub-band coefficients in each direction are calculated, and then the second-order LEDP (DTLEDP) at six scales is obtained. The direction of the center pixel $g_c$ is calculated by replacing $I^1_{\theta_1}(g_c) \Rightarrow DT^{2}_{m,\theta_1}(g_c)$ and $I^1_{\theta_2}(g_c) \Rightarrow DT^{2}_{m,\theta_2}(g_c)$ in (8). Similarly, DTLEDP, MLEDP, and PLEDP values can be obtained by (8)-(19). Finally, each layer produces 8 DTLEDP values for the real-part sub-band coefficients and 8 for the imaginary-part sub-band coefficients, and the three direction pairs (±15°, ±45°, ±75°) in each layer produce three MLEDP values and three PLEDP values for the real-part and for the imaginary-part sub-band coefficients. Therefore, the 28 (8+8+6+6) local pattern values obtained in each layer are converted into histograms as features.
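The histogram length P(P−1)+3 quoted for the uniform patterns can be verified by direct enumeration. The sketch below assumes the usual definition of a uniform pattern (at most two 0/1 transitions when the bits are read circularly) and assigns all non-uniform patterns to one shared bin; the function names are illustrative.

```python
def transitions(code, bits=8):
    """Number of 0/1 changes when the bit string is read circularly."""
    return sum(((code >> i) & 1) != ((code >> ((i + 1) % bits)) & 1)
               for i in range(bits))

def uniform_histogram_length(bits=8):
    """Uniform patterns (at most two transitions) get one bin each; all others share one bin."""
    uniform = [c for c in range(1 << bits) if transitions(c, bits) <= 2]
    return len(uniform) + 1

P = 8
n_bins = uniform_histogram_length(P)  # 58 uniform patterns + 1 shared bin
```

For P = 8 this reduces each histogram from 256 bins to 59.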
The pixel values along each direction are found from the neighborhood shown in Fig. 3, in which the yellow area is the neighborhood of $g_c$ with R = 1 and the blue area is the neighborhood of $g_c$ with R = 2. In this article, the neighborhood pixels of R = 1 are selected to calculate LEDP. As the radius increases, the number of directions may also increase, that is, the number of features will increase. In future work, we will use the neighborhood pixels of R = 2 to calculate the third-order LEDP.

III. TEXTURE IMAGE RETRIEVAL METHOD
Texture image retrieval includes two essential parts: feature extraction and similarity measurement. The feature extraction part selects features that fully reflect the image information. Good features should both reflect the global overview of the image and describe local image information, and retrieval performance can be effectively improved by fusing these two kinds of features. In this article, we choose the same statistical model (i.e., the WC distribution) for both RM and RP sub-band coefficient modeling in the DTCWT domain, which ensures the uniformity of the global modeling parameter features and of the corresponding similarity measurements. To compensate for the loss of local information caused by using global features in the transform domain, we propose a new local image descriptor (i.e., LEDP) to extract local features in the DTCWT domain, and the LBP local feature information in the spatial domain is also used. In addition, a matching similarity measurement is selected for each type of extracted feature: K-L divergence is selected for the statistical model parameter features, and the relative L1 distance is used for both local descriptor features.

A. SUB-BAND COEFFICIENT MODELING AND GLOBAL FEATURE SELECTION
In this article, the global features of a texture image are the statistical modeling parameters of the directional sub-band coefficients in the DTCWT domain. To unify the modeling form of the DTCWT sub-band coefficients, both the RM sub-band coefficients and the RP sub-band coefficients are modeled by the WC distribution. To verify the rationality of the selected statistical model, and considering the limited space, two typical images are selected from each of the DB2 and DB3 databases for DTCWT decomposition, and the sub-band coefficients of different directions at different scales are modeled. Typical experimental results are shown in Fig. 4. The first row shows the images from each image database; the second row shows the WC distribution modeling and histogram fitting of the RM sub-band coefficients at the third scale and in the 15° direction; the third row shows the WC distribution modeling and histogram fitting of the RP sub-band coefficients at the third scale and in the 15° direction. In the parentheses at the bottom of the fitting figures, the first two numbers are the parameters of the model (ρ, µ), and the last one is the objective evaluation index, namely the entropy difference rate $R_e$. $R_e$ is defined as the ratio of the K-L distance between the histogram and the model distribution, or relative entropy (ΔH), to the entropy (H) of the histogram distribution [15]; the smaller the value, the better the fit. The fitting results and the values of $R_e$ in Fig. 4 show that the selected statistical model fits the histograms of the sub-band coefficients well, that is, the modeling is suitable for both RM and RP sub-bands. We observe that the model parameter values differ between classes of texture images; therefore, different texture images can be effectively distinguished by the model parameter features.
More importantly, this can alleviate the problem, which arises when different statistical models are used, that the distance within a class is greater than the distance between classes. To verify the advantages of this modeling scheme, the 631st and 640th images in the DB3 image database (which belong to the same class) and the 475th and 640th images in the DB3 image database (which belong to different classes) are selected to estimate the model parameters, and the within-class and between-class K-L distances are calculated. The results are shown in Table 1.
The data in Table 1 show that the within-class distance is larger than the between-class distance when the RP and RM sub-band coefficients are modeled by two different models, while the within-class distance is smaller than the between-class distance when the RP and RM sub-band coefficients are modeled by the same model. Therefore, the modeling scheme selected in this article can improve the performance of texture image retrieval.
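The entropy difference rate $R_e$ used to judge the fits in Fig. 4 can be sketched as follows, taking the definition stated in this section: the K-L distance between the histogram and the model distribution, divided by the entropy of the histogram. The discretized distributions below are hypothetical, and the sketch assumes that the model assigns a positive probability to every non-empty histogram bin.

```python
import math

def entropy_difference_rate(hist, model):
    """R_e = KL(hist || model) / H(hist) for two discrete distributions over the same bins."""
    kl = sum(h * math.log(h / m) for h, m in zip(hist, model) if h > 0)
    entropy = -sum(h * math.log(h) for h in hist if h > 0)
    return kl / entropy

# Hypothetical normalized histogram and two candidate model distributions.
hist = [0.1, 0.4, 0.4, 0.1]
good_fit = [0.12, 0.38, 0.38, 0.12]
poor_fit = [0.25, 0.25, 0.25, 0.25]
re_good = entropy_difference_rate(hist, good_fit)
re_poor = entropy_difference_rate(hist, poor_fit)
```

A smaller value indicates a better fit, so `re_good` is smaller than `re_poor` here.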

B. LEDP AND LOCAL FEATURE SELECTION
The local features in this article are mainly obtained from the LEDP (also including MLEDP and PLEDP) histograms of the directional sub-bands in the DTCWT domain. To verify that the proposed LEDP is multi-directional, we compare it with LTrP, which is also multi-directional. LTrP and LEDP are applied to the same test image to obtain the four directions of LTrP and the eight directions of LEDP. The experimental results are shown in Fig. 5. Fig. 5 illustrates that the selected test images have obvious multiple directionalities, which can be used to compare the ability of different local descriptors to describe direction; compared with LTrP, LEDP represents more detailed direction information. Therefore, LEDP readily captures more local direction information for texture images with rich directional content, and combines better with the DTCWT directional sub-bands, thus effectively improving texture image retrieval.

C. LEDP ROBUSTNESS
The LEDP proposed in this article has several useful properties. To verify them, the Tile.0001 image in the DB3 database is selected for experiments. We tested the resistance of LEDP to rotation, illumination change, and scaling, and the experimental results are shown in Fig. 6. Because LEDP produces many directional maps and space is limited, only part of the directional spectrum of LEDP (direction 1 and direction 8) is selected for testing. In addition, because some of the properties cannot be seen directly from the LEDP spectrum, the histograms of the eight directional spectra generated by LEDP are merged.
Our experimental scheme is to verify the robustness of LEDP to rotation, illumination change, and scaling by comparing LEDP spectra and LEDP histograms under different conditions. From Fig. 6(e)-(i), we can see that the LEDP spectrum of the original image under different illumination is basically the same; when the original image is rotated by 90°, the LEDP spectrum obtained is also close to the spectrum of the original image. Fig. 6(m)-(r) shows that the LEDP proposed in this article has good robustness to rotation, illumination change, and scaling of texture images.

D. SIMILARITY MEASUREMENT
In this article, K-L divergence is used as the similarity measurement for the parameter features of the WC statistical distribution model. Because this similarity measurement does not have a closed form, it is estimated numerically [15]. For each direction sub-band in the DTCWT domain, the K-L divergence is defined as

$$D_C\big(p(\theta;\rho_1,\mu_1),\,q(\theta;\rho_2,\mu_2)\big)=\int_{-\pi}^{\pi}p(\theta;\rho_1,\mu_1)\,\ln\frac{p(\theta;\rho_1,\mu_1)}{q(\theta;\rho_2,\mu_2)}\,d\theta \tag{20}$$

where $p(\theta;\rho_1,\mu_1)$ and $q(\theta;\rho_2,\mu_2)$ are the WC probability density functions of the database candidate image and the query image, respectively; $\rho_1, \mu_1$ and $\rho_2, \mu_2$ denote the scale and location parameters of the statistical models of the database candidate image and the query image, respectively. Correspondingly, the similarity measurements between the sub-band statistical models of RM and RP are denoted $D_{RM}$ and $D_{RP}$, respectively. The relative L1 distance is chosen as the similarity measurement for the histogram features of the LBP and LEDP local descriptors. The normalized Euclidean distance (L2) is also used as a similarity measurement in a comparative experiment. For each bin of the histogram, the relative L1 distance and L2 are defined by (21) and (22), respectively, where $h_{l1,i}$ and $h_{l2,i}$ denote the histogram features of the database candidate image and the query image, respectively, i is the index of the bin of the feature histogram, and M is the standard deviation of all bin values. Correspondingly, for each bin, the similarity measurement of LBP is denoted $D_{LBP}$; and for each sub-band and each bin, the similarity measurement of LEDP is denoted $D_{LEDP}$. Finally, the similarity measurements corresponding to the various features are combined into a total similarity measurement in the form of a convex linear combination, defined by (23), where M and N are the numbers of directional sub-bands, $K_1$ and $K_2$ are the numbers of bins in the histograms, and a + b + c + d = 1, 0 ≤ a, b, c, d ≤ 1. In the experiments, the retrieval method proposed in this article is simulated in MATLAB; the ranges of a, b and c are traversed with a step size of 0.001, and, according to the optimal average retrieval rate, a = 0.018, b = 0.072, c = 0.82, and d = 0.09 are set for the DB2 image database, and a = 0.02, b = 0.08, c = 0.8, and d = 0.1 are set for the DB3 image database.
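The numerical estimate of the K-L divergence in (20) and the convex combination of distances can be sketched in a few lines. The sketch assumes the standard WC density and a midpoint-rule integration with a hypothetical number of nodes. The function `combined_distance` is only an illustration of a convex linear combination: the assignment of the weights a, b, c and d to the individual distances, and the sample distance values, are assumptions and do not reproduce (23).

```python
import math

def wc_pdf(theta, rho, mu):
    """Standard wrapped Cauchy density on [-pi, pi]."""
    return (1.0 - rho ** 2) / (2.0 * math.pi * (1.0 + rho ** 2 - 2.0 * rho * math.cos(theta - mu)))

def kl_wc(rho1, mu1, rho2, mu2, n=4000):
    """Midpoint-rule estimate of the K-L divergence between two WC densities."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = -math.pi + (k + 0.5) * h
        p = wc_pdf(t, rho1, mu1)
        q = wc_pdf(t, rho2, mu2)
        total += p * math.log(p / q)
    return total * h

def combined_distance(d_rm, d_rp, d_ledp, d_lbp, weights):
    """Convex combination of per-feature distances (weight-to-feature mapping is assumed)."""
    a, b, c, d = weights
    if min(weights) < 0.0 or abs(a + b + c + d - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return a * d_rm + b * d_rp + c * d_ledp + d * d_lbp

kl_same = kl_wc(0.5, 0.0, 0.5, 0.0)  # identical models
kl_diff = kl_wc(0.5, 0.0, 0.8, 1.0)  # different models
# Hypothetical distances combined with the DB2 weights reported above.
total = combined_distance(kl_diff, 0.2, 0.4, 0.3, (0.018, 0.072, 0.82, 0.09))
```

Because the WC density is strictly positive for ρ < 1, the logarithm in the integrand is always defined.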

E. PROPOSED METHOD
The retrieval algorithm proposed in this article is shown as follows

IV. EXPERIMENTS AND DISCUSSION
In order to present and compare the performance of the method proposed in this article, three widely used texture image databases are selected for the experiments. The following comparative experiments are conducted to verify the effectiveness of this method by setting different database combinations.

A. EXPERIMENTAL DATABASES
All retrieval experiments in this article are carried out on three image databases. The first image database uses images from the Corel database [45]. Some researchers consider that the Corel database meets all the requirements for evaluating an image retrieval system because of its large size and heterogeneous content. In the experiment, 1000 images were collected and divided into ten categories to form the large-sized database DB1. These images come from ten different domains, namely, Africans, beaches, buildings, buses, dinosaurs, elephants, flowers, horses, mountains, and food. Each category has 100 images with a resolution of either 256 × 384 or 384 × 256. The second image database contains 116 different categories of texture images with a resolution of 512 × 512, of which 109 texture images are from the Brodatz texture image database [46] and seven texture images are from the University of Southern California image database [47]. Each 512 × 512 image is divided into 16 non-overlapping sub-images with a resolution of 128 × 128, thus creating the medium-sized image database DB2 with 116 categories, 16 images per category, and a total of 1856 (116 × 16) images. The third image database contains 40 different types of texture images with a resolution of 512 × 512, all of which come from the MIT VisTex texture image database [48]. Each 512 × 512 image is also divided into 16 non-overlapping sub-images with a resolution of 128 × 128, thus creating the small-sized image database DB3 with 40 categories, 16 images per category, and a total of 640 (40 × 16) images. Fig. 7 illustrates sample images from the three image databases. In addition, for the DB2 and DB3 image databases, to reduce the correlation of gray values among similar sub-images and maintain the fairness of the retrieval process, all sub-images are normalized to zero mean and unit standard deviation.
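The construction of DB2 and DB3 can be illustrated with a small sketch that splits a square image into non-overlapping tiles and normalizes each tile to zero mean and unit standard deviation. A toy 8 × 8 image with 2 × 2 tiles stands in for the 512 × 512 images and 128 × 128 sub-images; the function names and pixel values are hypothetical, and normalizing each sub-image separately with the population standard deviation is an assumption.

```python
import math

def split_into_tiles(image, tile):
    """Split a square image (list of rows) into non-overlapping tile x tile sub-images."""
    n = len(image)
    return [[row[c:c + tile] for row in image[r:r + tile]]
            for r in range(0, n, tile) for c in range(0, n, tile)]

def normalize(sub_image):
    """Shift and scale gray values to zero mean and unit standard deviation."""
    values = [v for row in sub_image for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [[(v - mean) / std for v in row] for row in sub_image]

# Toy 8 x 8 image cut into 2 x 2 tiles, mirroring 512 x 512 -> 16 tiles of 128 x 128.
image = [[(r * 8 + c) % 7 for c in range(8)] for r in range(8)]
tiles = split_into_tiles(image, 2)
normalized = [normalize(t) for t in tiles]
```

A constant tile would have zero standard deviation and would need separate handling; the toy image is chosen so that this case does not occur.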

B. PERFORMANCE EVALUATION INDEX
In the experiments, each image in the image database is treated as a query image. According to the extracted image features, the similarity measurement is calculated by (23), and finally, the top N images with the shortest distance from the query image are selected in the database. The performance of the retrieval system in this article is evaluated by precision, recall and average retrieval rate (ARR). In their definitions, M is the total number of images in the database, $S_i$ is the number of correct images retrieved for the ith query, V is the number of images retrieved each time, and R is the number of images in each category.
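As an illustration of these indices, the sketch below uses the definitions commonly adopted in the retrieval literature, which are assumed here rather than taken from this article: precision is the fraction of retrieved images that are correct ($S_i$/V), recall is the fraction of the category that is retrieved ($S_i$/R), and ARR is the mean recall over all M queries. The query results are hypothetical.

```python
def precision(correct, retrieved):
    """Fraction of the V retrieved images that are correct."""
    return correct / retrieved

def recall(correct, per_category):
    """Fraction of the R images of the query's category that were retrieved."""
    return correct / per_category

def average_retrieval_rate(correct_counts, per_category):
    """Mean recall over all queries (each database image serves once as the query)."""
    return sum(recall(s, per_category) for s in correct_counts) / len(correct_counts)

# Toy run: 4 queries, V = 16 images retrieved each time, R = 16 images per category.
correct_counts = [16, 12, 8, 12]
arr = average_retrieval_rate(correct_counts, 16)
```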

C. COMPARISON OF RETRIEVAL PERFORMANCE BASED ON LTRP AND LEDP IN DB1 AND DB2
In order to compare the effect of two local direction descriptors, namely LTrP and LEDP proposed in this article, on the performance of texture image retrieval, the experiments are carried out in two distinct image databases DB1 and DB2.
The experimental results are shown in Fig. 8 and Table 2. Fig. 8 illustrates the comparisons of the precision and the recall between LEDP and LTrP based on the spatial domain and the transform domain in the ten categories of images from the DB1 image database. The abbreviations used in the analysis of the experimental results are given below.
LBP: LBP features [31]
DTLBP: LBP in DTCWT
LDP: Local derivative patterns [17]
LTP: Local ternary patterns [18]
LTrP: Local tetra patterns [31]
GLTrP: Local tetra patterns with GT [31]
LMEBP: Local maximum edge patterns [49]
DTLTrP: LTrP in DTCWT
LEDP: Local eight direction patterns
3-6 LDTLEDP: Local eight direction patterns with three- to six-layer decompositions in DTCWT
4 LDTLEDPL2: Local eight direction patterns with four-layer decompositions in DTCWT with L2 distance
Fig. 8(a) illustrates that, in the spatial domain, the recall of LEDP in 3 out of 10 categories is better than that of LTrP for the DB1 image database; in the transform domain, the recall of 7 out of 10 categories of DTLEDP is better than that of DTLTrP. These results show that the local descriptor LEDP proposed in this article has more advantages in the DTCWT domain than LTrP. Fig. 8(b) illustrates that, compared with DTLTrP, DTLEDP has similar advantages in precision, where 6 out of 10 categories of DTLEDP have better precision than DTLTrP. When the similarity measurement is L2, the precision and recall are lower than those of L1, which supports the similarity measurement selected in this article. Besides, the number of decomposition layers in the DTCWT domain also affects the retrieval performance of DTLEDP. When the number of decomposition layers is appropriately increased, the retrieval performance improves. However, when the number of decomposition layers is increased beyond a certain point, the retrieval performance decreases instead.
For this reason, to ensure better retrieval performance, we select five-layer decompositions for DTLEDP in this article. The data in Table 2 show that the retrieval performance of LEDP and DTLEDP is superior to that of most existing local descriptors on image database DB2. Nevertheless, their retrieval performance is still lower than that of LTrP. This result may be caused by some of the eight directions of LEDP being insensitive to the random information in a few images. Considering that the LEDP proposed in this article is mainly applied in the DTCWT domain, in order to better utilize the direction information of the sub-bands, DTLEDP indeed has certain advantages over DTLBP, GLTrP and DTLTrP. Therefore, DTLEDP can provide more local direction information than DTLTrP in the DTCWT domain.
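The precision and recall comparison under the L1 and L2 distances can be illustrated with a minimal sketch. It is not the evaluation code used in the experiments: the feature vectors, labels, function names and the value of `top_k` are hypothetical, and the query image is assumed to be counted among the retrieved images.

```python
import numpy as np

def l1_distance(query, database):
    """City-block (L1) distance from one feature vector to each database row."""
    return np.abs(database - query).sum(axis=1)

def l2_distance(query, database):
    """Euclidean (L2) distance from one feature vector to each database row."""
    return np.sqrt(((database - query) ** 2).sum(axis=1))

def precision_recall(query_idx, features, labels, top_k, distance):
    """Precision and recall of the top_k images retrieved for one query.

    Assumption: the query itself stays in the database and is retrieved first.
    """
    dists = distance(features[query_idx], features)
    ranked = np.argsort(dists, kind="stable")[:top_k]
    hits = int((labels[ranked] == labels[query_idx]).sum())
    n_relevant = int((labels == labels[query_idx]).sum())
    return hits / top_k, hits / n_relevant

# Hypothetical toy database: three images of class 0, two of class 1.
features = np.array([[0.0, 0.0], [3.0, 0.0], [10.0, 0.0], [2.0, 2.0], [9.0, 9.0]])
labels = np.array([0, 0, 0, 1, 1])

p_l1, r_l1 = precision_recall(0, features, labels, top_k=2, distance=l1_distance)
p_l2, r_l2 = precision_recall(0, features, labels, top_k=2, distance=l2_distance)
print("L1:", p_l1, r_l1)
print("L2:", p_l2, r_l2)
```

In this toy set the two distances rank the second and fourth images differently, so the retrieved sets differ. It only shows how the choice of distance can change precision and recall; it does not reproduce the behavior reported for DB1.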

D. COMPARE RETRIEVAL PERFORMANCE OF DIFFERENT METHODS IN DB2 AND DB3
Several existing methods that report results on image databases DB2 and DB3 are selected, and their retrieval performance (ARR) is compared with that of the method proposed in this article; the experimental results are shown in Table 3. In Table 3, the "GGD-WC" and "GGD-Vonn" methods use the GGD distribution parameter features of the sub-band coefficients, together with the WC distribution parameter features and the Vonn distribution parameter features, respectively, of the RP sub-band coefficients in the uniformly discrete curvelet complex transform domain [15], [41]; the "LBP", "LMEBP" and "LTriDP" methods use the histogram features of LBP, LMEBP and LTriDP in the spatial domain [20], [31], [49]; the "SVD+LBP" method combines the singular value decomposition (SVD) feature in the DTCWT domain with the LBP histogram feature in the spatial domain [50]; the "CoALTP" method fuses LTP and GLCM in the spatial domain [39]; the "LDPVBP-CH" method fuses LDPVBP and CH in the spatial domain [22]; the "LNDP+LBP" method fuses LNDP and LBP in the spatial domain [26]; the "RM" method uses only the statistical features of the WC distribution in the DTCWT domain; the "LBP+DP" method uses LBP in the spatial domain and DTLEDP in the DTCWT domain; the "LBP+RM" method uses LBP in the spatial domain and the statistical features of the WC distribution in the DTCWT domain; and the "RM+DP" method uses the statistical features of the WC distribution and DTLEDP in the DTCWT domain. It should be noted that the DB2 texture images used by the "LNDP+LBP" and "LTriDP" methods are all taken from the Brodatz texture image database, with a total of 2800 (112 × 25) texture images. In Table 3, "PM" denotes the method proposed in this article.
In our retrieval experiments, to obtain the best average retrieval rate, three-layer decompositions in the DTCWT domain are used for extracting the statistical modeling features, five-layer decompositions in the DTCWT domain are used for extracting the DTLEDP, MLEDP and PLEDP features, and the number of bins is 59 for all local pattern histograms.
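The figure of 59 bins matches the standard uniform-pattern mapping for 8-neighbor binary patterns: 58 uniform codes plus one shared bin for all remaining codes. This section states only the bin count, so the mapping below is an assumption rather than a description of the authors' implementation, and the function names are hypothetical.

```python
def circular_transitions(code, bits=8):
    """Count 0/1 transitions when the bit pattern is read as a circle."""
    transitions = 0
    for i in range(bits):
        current_bit = (code >> i) & 1
        next_bit = (code >> ((i + 1) % bits)) & 1
        transitions += int(current_bit != next_bit)
    return transitions

def uniform_bin_lookup(bits=8):
    """Map every pattern code to a histogram bin.

    Each uniform pattern (at most two circular transitions) gets its own bin;
    all non-uniform patterns share one extra bin.
    """
    uniform_codes = [c for c in range(2 ** bits)
                     if circular_transitions(c, bits) <= 2]
    index_of = {code: i for i, code in enumerate(uniform_codes)}
    shared_bin = len(uniform_codes)
    lookup = [index_of.get(code, shared_bin) for code in range(2 ** bits)]
    return lookup, shared_bin + 1

lookup, n_bins = uniform_bin_lookup()
print(n_bins)  # 58 uniform patterns + 1 shared bin = 59
```

A histogram built with this lookup has one entry per bin regardless of image size, which keeps the feature length of each local pattern histogram fixed.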
The data in Table 3 show that the retrieval performance achieved by the method proposed in this article is better than the best results of the existing approaches, which indicates that our method is effective. Moreover, compared with the ARR obtained by the "LBP+DP", "LBP+RM" and "RM+DP" methods, the three-feature fusion method used in this article substantially improves the ability to characterize texture images. Table 4 lists the feature vector length and the ARR in DB2 for PM and prior methods. Table 4 shows that, although the feature vector of PM is longer than that of the other methods, PM outperforms them in terms of ARR.

V. CONCLUSION
In image retrieval, multi-feature fusion can effectively enhance the ability of features to represent images, and thus significantly improve retrieval performance. For this reason, a new texture image retrieval method based on the DTCWT domain and the spatial domain is proposed in this article, which combines global statistical features and local pattern features. Specifically, the method combines the LEDP features proposed in this article, LBP features and statistical features to characterize the texture information of an image. Accordingly, the similarity measurement takes the form of a convex linear combination of the similarity measurements corresponding to each feature. The results of the retrieval experiments on the Corel (DB1), Brodatz (DB2) and VisTex (DB3) image databases verify the effectiveness and feasibility of our method; compared with the best results of existing methods, the proposed approach achieves better retrieval performance.
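The convex linear form of the similarity measurement can be sketched as a weighted sum of per-feature distances whose weights are non-negative and sum to one. The sketch below is illustrative only: the distance values, the weights and the function name are hypothetical, and it assumes the per-feature distances have already been brought to comparable scales.

```python
import numpy as np

def fuse_distances(distances, weights):
    """Convex combination of per-feature distance vectors.

    distances: one 1-D array per feature type, each holding the distance
               from the query to every database image.
    weights:   non-negative weights that sum to 1 (one per feature type).
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    stacked = np.stack([np.asarray(d, dtype=float) for d in distances])
    return weights @ stacked

# Hypothetical distances from one query to three database images.
d_stat = [0.0, 0.4, 1.0]   # statistical (WC parameter) features
d_lbp = [0.0, 1.0, 0.2]    # LBP histogram features
d_ledp = [0.0, 0.6, 0.8]   # LEDP histogram features

fused = fuse_distances([d_stat, d_lbp, d_ledp], weights=[0.2, 0.3, 0.5])
ranking = np.argsort(fused, kind="stable")
print(fused, ranking)
```

Because the weights form a convex combination, the fused distance always lies between the smallest and largest per-feature distance for each image, so no single feature can dominate beyond its assigned weight.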
In this article, the LEDP is applied only in the DTCWT domain. It could also be applied in the pyramid dual-tree directional filter bank (PDTDFB) and Gabor complex transform domains, which offer more flexible directionality. In addition, only the relative magnitude sub-bands and relative phase sub-bands in the DTCWT domain are modeled, whereas the low-frequency sub-bands are not. In the future, we will continue to look for statistical models suitable for all DTCWT sub-bands in order to improve retrieval performance. Furthermore, given its effectiveness, the method proposed in this article could also be applied to the retrieval of other image databases, such as facial image and biomedical image databases.
HUAIJING QU received the B.S. degree in electric engineering from the Shandong University of Technology, Jinan, China, in 1986, and the Ph.D. degree in signal and information processing from Shandong University, Jinan, in 2009. He is currently a full-time Associate Professor with the School of Information and Electric Engineering, Shandong Jianzhu University, Jinan. His research interests include multi-scale and multidirectional image processing, image retrieval, image fusion, and pattern recognition.
JIA XU was born in Heze, China, in 1996. He received the B.S. degree in communication engineering from Shandong Jianzhu University, in 2018, where he is currently pursuing the master's degree in control engineering. His current research interests include image processing, especially texture image retrieval.
JIWEI WANG received the B.S. degree in automation from the Anyang Institute of Technology, in 2018. He is currently pursuing the M.S. degree in control science and engineering with Shandong Jianzhu University. His current research interests include computer vision and image fusion.
YANAN WEI was born in Jining, China, in 1995. She received the B.S. degree in computer science and technology from the Qilu University of Technology, in 2019. She is currently pursuing the master's degree in control engineering with Shandong Jianzhu University. Her current research interest includes image processing, especially image fusion.
ZHISHENG ZHANG was born in Henan, China, in 1995. He received the B.S. degree from the Henan Institute of Engineering, in 2019. He is currently pursuing the master's degree in control science and engineering with Shandong Jianzhu University. His current research interest includes image retrieval.