Texture Smoothing Quality Assessment via Information Entropy



I. INTRODUCTION
Texture, as a set of texture elements (texels) occurring in regular or repetitive patterns, can be created artificially in an image or found in captured natural scenes, such as mosaics, sand, and rocks. Image texture smoothing is commonly used to remove redundant background textures, which may be global or local details in an image. Unlike the assessment of denoised images, the quality assessment of detextured images needs to measure the loss of texture details while also accounting for the retention of structure features in the relevant areas. It is challenging to assess texture filtered image quality effectively, since 1) texture-free images that could serve as ground truths are missing, and 2) texture smoothness and structure retention must be considered jointly. So far, very few existing image quality assessment (IQA) metrics are usable for measuring detextured images. Designing an objective and quantitative evaluation index is not only conducive to comparing texture filtering works, but also beneficial to many practical low- and high-level vision applications.
The associate editor coordinating the review of this manuscript and approving it for publication was Senthil Kumar.
For decades, User Study has been widely used to evaluate de-textured image quality. It requires designing an independent user evaluation system and normalizing the viewers' scores to perform a qualitative assessment of a filtered image. Because User Study is intuitive and simple and needs no complex mathematical formulas, it is suitable for practical applications. However, when some filtered images look very similar to human observers, it is difficult for User Study to produce a reasonable evaluation. In order to comply fully with human vision, metrics based on the human visual system (HVS) [1], [2], which rely on salient physiological and psychovisual features, have been presented; they capture the statistics of important natural signals to match the result of the subjective evaluation. Similarly, the Information Fidelity Criterion (IFC) [3] and Blind Image Quality Assessment (BIQA) [4] are also devised by using statistically significant feature information. Due to the lack of real human perception, these metrics cannot fully unify the subjective textures and objective features of texture filtered images, which leads to large deviations from subjective results.
Other metrics focus on structure details determined by human perception, such as the subjective methods [5]-[7]. The Edge Preservation Index (EPI) [5], based on regions of interest (ROI), is obtained from the correlation of signal intensity in the ROI and performs well for measuring local structure sharpness. Besides, the improvement of PSNR (IPSNR) [8] adopts local blocks to calculate PSNR, which concentrates on the lower bounds of filtering efficiency for the visual quality of images. In contrast, texture filtering should focus on the smoothness of the texture rather than its sharpness. These quality evaluation methods based on local details only analyze the edge features of the image and do not combine texture smoothness with structural retention, so they cannot achieve a comprehensive quality measurement. To compare the overall performance on texture filtered images, the early common methods, Mean Square Error (MSE) [9], Peak Signal-to-Noise Ratio (PSNR) [10] and Structure Similarity Index Measure (SSIM) [11], have been used by some texture filters [12]-[14] for comprehensively evaluating filtered images. These common methods use a global measurement scheme, yet they cannot provide an effective comprehensive evaluation that reflects both the flatness of texture and the sharpness of structure in a de-textured image.
The aforementioned objective evaluation metrics perform well on de-noised or de-blurred images, but they are limited in comprehensively measuring texture smoothness and structure retention. In particular, when texture details in some texture images need to be recognized subjectively by human perception, these assessment methods can easily misjudge some texture details as structures. In this work, aiming at the subjective definition of textures, the locality of texture details, and the comprehensiveness of IQA, we integrate texture smoothness and structure similarity to construct a comprehensive objective evaluation index T3SI, based on the intuitive selection of textures and features. We first manually select some patches from the original image and the filtered image; these patches should respectively contain relatively more textures and features as defined by human perception. Thus, the selected patches can represent texture and structure regions. Then, EPI is employed to measure the texture smoothness on the selected patches representing texture regions, and SSIM is used to measure the edge structure similarity on the patches that include significant features. Combining EPI and SSIM by using the exponential cross entropy, we construct a final comprehensive exponential cross-entropy assessment function for texture filtered images.
The contributions of our approach are as follows.
• Because ground-truth texture-free images are lacking, we intuitively identify textures and structural features by human perception. In this way, the hard-to-quantify task of texture smoothing evaluation is transformed into a quantifiable problem.
• We use local patches to distinguish image texture details from structural details, which ensures accurate measurements of the texture smoothness and the structural similarity.
• Our T3SI synthesizes the locally evaluated results of texture smoothness and structure retention via the exponential entropy, which can provide a comprehensive and quantitative measurement for texture filtering.
The remainder of this paper is organized as follows. In Section II, we briefly review the related work, including no-reference and full-reference IQA methods. In Sections III and IV, we describe our approach in detail, including the recognition and definition of texture regions, the measurement of texture smoothness and structure similarity, and the construction of the comprehensive index T3SI. In Section V, we analyze the implementation of our method and compare it with several similar methods, followed by the conclusion in Section VI.

II. RELATED WORK
Among IQA metrics, some full-reference approaches could be used for measuring low-scale texture images. Generally, since most texture images obtained from nature usually have no real background, no-reference IQA models have a certain applicability in texture images. In this section, we mainly introduce the related no-reference and full-reference methods of image quality assessment.
No-reference image quality assessment is currently a very active research topic, and various excellent algorithms [15]-[19] have been designed for evaluating different distorted images. NIQE [15] mainly considers the construction of a 'quality aware' collection of statistical features based on a natural scene statistic (NSS) model. Similarly, Appina et al. [16] proposed a no-reference image quality evaluation model based on natural scene statistics, which can accurately extract the luminance coefficient and the parallax subband coefficient and achieve excellent performance. PIQE [17] is an opinion-unaware methodology that attempts to quantify distortion by extracting local features for predicting quality, and BRISQUE [18] uses scene statistics of locally normalized luminance coefficients to quantify possible losses of 'naturalness' in the image. In contrast, Akhter et al. [19] first extract the segmentation and disparity map features from the left and right views, and use logistic regression to predict the image quality score. Besides, some no-reference IQA models [19]-[22] have been developed for perceptual quality evaluation of stereoscopic images. These no-reference methods analyze image quality based on the 'sharpness' of statistical feature details of distorted images. For complex texture filtered images, however, an adequately smoothed texture image will receive the opposite evaluation.
Unlike no-reference IQA approaches, full-reference models are more widely applied to the quality assessment of images with a ground-truth background, such as images with added synthetic noise, enhanced images and segmented images. The main evaluation metrics are PSNR [23], MSE [9], [24], [25] and SSIM [11]. PSNR is commonly used to evaluate noisy images. MSE and its extensions PAMSE [26] and CW-MSE [27] were proposed based on average error. They measure the quality of an image by calculating the global size of the pixel error between the distorted image and the reference image, and are relatively simple and easy to implement. Considering the statistical correlation of structure features, SSIM and its improvements, MS-SSIM [28], MS-SSIM [29] and RFSIM [30], are also widely used for various image quality assessment tasks. These SSIM-like methods define the structural information as the structure of the object according to brightness and contrast, and perform well on images with obvious edge features. To highlight the local region of interest, QILV [31] compares the local variance distributions of two images to assess the non-stationarity of images. In addition, SFF [32] adopts sparse features obtained by a feature detector, and IFS [33] uses only a part of the reference image information. Since the loss of texture details must be regarded as a significant index for texture filtering, these full-reference metrics also struggle to provide a comprehensive measurement of textures and structures.
Our evaluation approach is devised to comprehensively measure the smoothness of texture and the retention of structure based on a relative full-reference.

III. OVERVIEW
To provide an overall view, we illustrate the pipeline in Fig. 1 and succinctly introduce it as follows.
First, we intuitively identify some texture/structure regions on the original image by human perception and click on the centers of these regions with the mouse. Setting an appropriate radius r, we obtain texture/structure patches with the size (2r + 1) × (2r + 1) centered at these clicked points. Similarly, according to the positions clicked on the original image, we also crop the corresponding texture/structure patches from the filtered image.
Second, based on these texture/structure patches, we employ EPI and SSIM to calculate the negative texture smoothness and the structure similarity, respectively.
Finally, combining the results of 1-EPI and SSIM, we formulate an exponential cross-entropy function (T3SI) to evaluate texture filtered images comprehensively.

IV. METHOD
Due to the locality and unrecognizability of textures/ structures, we construct a comprehensive assessment metric from the following three aspects: 1) the determination of texture and structure, 2) the measurement of texture smoothness and structural retention, 3) the formulation of comprehensive evaluation index.

A. DETERMINATION OF TEXTURE AND STRUCTURE
The accurate identification of textures and structures is a prerequisite for IQA. Generally, the textures and structures in an image are not distinguished by a strict criterion, and different filtering methods may even define different texture regions in the same image. For instance, the flecks on the giant salamander in Fig. 2 (a) are structural features but are regarded as texture details by some filtering methods in specific applications. In order to suit actual applications, we adopt the intuitive method, which is also the most direct and effective approach: identifying textures/structures by human perception.
Since the texture region has a structural consistency between the local region and the whole region, we can use some local texture patches in place of the whole texture region to measure the texture smoothness. After the texture areas are recognized by human perception and N clicks are given inside them (black background in Fig. 2 (a)) with the mouse, the positions $p_i$ ($i = 1, 2, \ldots, N$) (yellow markers in Fig. 2 (a)) representing texture are marked on the original texture image. We set an appropriate patch radius r to ensure that each selected patch $\Omega_i$ with the size (2r + 1) × (2r + 1) centered at $p_i$ is located inside texture areas, and extract the box patches according to the selected points $p_i$ ($i = 1, 2, \ldots, N$). Once all box patches are cropped, we obtain a texture patch set $S = \bigcup_{i=1}^{N} \Omega_i$, which can represent all texture regions (see yellow boxes in Fig. 2 (b)).
The structure positions $q_i$ ($i = 1, 2, \ldots, N$) (red markers in Fig. 2 (a)) should be located in areas that include many structure features. Similar to the texture patches, we can also obtain the structure patch set $\bar{S} = \bigcup_{i=1}^{N} \bar{\Omega}_i$ (see Fig. 2 (b)). Here $\bar{\Omega}_i$ represents the i-th structural patch.
Using the same positions as the corresponding clicks $p_i$, $q_i$ ($i = 1, \ldots, N$), we also obtain the patch sets representing the texture regions $S_1$ and the structure regions $\bar{S}_1$ from the filtered image (see Fig. 2 (c)). By using the selected patch sets $S$, $\bar{S}$, $S_1$, $\bar{S}_1$ from the original image and the filtered image, we can calculate the texture smoothness and the structure retention.
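The patch selection described in this subsection can be sketched as follows. This is a minimal illustration under our own assumptions: images are plain nested lists of gray values, clicks are (row, column) pairs, and the helper names `crop_patch` and `collect_patches` are hypothetical rather than part of the original implementation.

```python
def crop_patch(image, center, r):
    """Return the (2r+1) x (2r+1) patch centered at center = (row, col)."""
    row, col = center
    height, width = len(image), len(image[0])
    # Reject clicks whose patch would leave the image (assumed policy).
    if row - r < 0 or col - r < 0 or row + r >= height or col + r >= width:
        raise ValueError("patch leaves the image; use a smaller r or another click")
    return [line[col - r: col + r + 1] for line in image[row - r: row + r + 1]]


def collect_patches(original, filtered, clicks, r):
    """Crop patches at the same clicked positions from both images."""
    from_original = [crop_patch(original, p, r) for p in clicks]
    from_filtered = [crop_patch(filtered, p, r) for p in clicks]
    return from_original, from_filtered


# Toy 6 x 6 "original" image and an all-zero "filtered" image.
original = [[10 * i + j for j in range(6)] for i in range(6)]
filtered = [[0] * 6 for _ in range(6)]
S, S1 = collect_patches(original, filtered, clicks=[(2, 2), (3, 3)], r=1)
print(S[0])  # [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
```

The same function can be called once with the texture clicks and once with the structure clicks, which yields the four patch sets used in the following subsections.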

B. TEXTURE SMOOTHNESS
We consider the residual edge structures in texture regions as the negative measurement of the texture smoothness, and adopt the edge preservation index EPI [5] to quantify the retention of edges, since EPI based on the region of interest (ROI) provides a good supplementary evaluation of edge structures in the local texture region. It is easy to see that when a texture region is fully smoothed, there is a large difference between the original region and the filtered region, which means that a smoother texture region corresponds to a smaller value of EPI. Hence, the complement 1 − EPI can be used directly to measure the smoothness of texture regions.
EPI [5] is computed from the texture patch sets, where $\bar{x}$ and $\bar{\hat{x}}$ are the mean pixel values in the sets $S$ and $S_1$ from the original image and the filtered image, $T$ represents the number of pixels in $S$, and $x_i$ and $\hat{x}_i$ are the pixel values in $S$ and $S_1$. To obtain significant edge information, the original and filtered images are each processed by a Laplacian high-pass filter beforehand.
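As an illustration of this step, the sketch below computes a correlation-form EPI on Laplacian-filtered patches. It rests on several assumptions of ours: the commonly used correlation form of EPI is taken to be the one meant by [5], a 4-neighbour Laplacian kernel is used, the filter is applied per patch with border pixels dropped (the text filters the whole images first), and EPI is set to 0 when one side has no edge response at all.

```python
import math

# Assumed 4-neighbour Laplacian high-pass kernel.
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]


def laplacian(patch):
    """Filter a 2-D patch (list of rows); return interior responses as a flat list."""
    height, width = len(patch), len(patch[0])
    out = []
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            out.append(sum(LAPLACIAN[a][b] * patch[i + a - 1][j + b - 1]
                           for a in range(3) for b in range(3)))
    return out


def epi(original_patches, filtered_patches):
    """Correlation between the high-pass responses of the two patch sets."""
    x = [v for p in original_patches for v in laplacian(p)]
    y = [v for p in filtered_patches for v in laplacian(p)]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))
    return num / den if den > 0 else 0.0  # assumed convention for flat patches


textured = [[1, 2, 3, 4], [5, 9, 2, 8], [3, 7, 1, 6], [4, 4, 9, 0]]
flat = [[5] * 4 for _ in range(4)]
epi_unchanged = epi([textured], [textured])  # edges fully preserved
epi_smoothed = epi([textured], [flat])       # edges fully removed
smoothness = 1.0 - epi_smoothed              # the 1 - EPI term of this section
```

With this convention a texture patch that is left untouched gives EPI close to 1, while a completely flattened patch gives EPI = 0 and therefore the largest smoothness value 1 − EPI.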

C. STRUCTURE SIMILARITY
The structure similarity of $\bar{S}$ and $\bar{S}_1$ can be regarded as the retention of structure features in the filtered image.
Unlike the texture smoothness, the structure similarity may require more information to measure the distortion of structural details, such as brightness, contrast and structural composition. Since the Structure Similarity Index Measure (SSIM) [11] reflects these properties of the object structure in the scene, we use SSIM as the structure similarity index to measure the retention of features in structure regions. SSIM [11] can be expressed as
$$\mathrm{SSIM}(\bar{S}, \bar{S}_1) = \frac{(2\nu_{\bar{S}}\,\nu_{\bar{S}_1} + C_1)(2\sigma_{\bar{S}\bar{S}_1} + C_2)}{(\nu_{\bar{S}}^2 + \nu_{\bar{S}_1}^2 + C_1)(\sigma_{\bar{S}}^2 + \sigma_{\bar{S}_1}^2 + C_2)},$$
where $\nu_{\bar{S}}$ and $\nu_{\bar{S}_1}$ are the average intensities of the patches $\bar{S}$ and $\bar{S}_1$, respectively. The parameters $\sigma_{\bar{S}}^2$ and $\sigma_{\bar{S}_1}^2$ represent the corresponding variances of the sets $\bar{S}$ and $\bar{S}_1$, and $\sigma_{\bar{S}\bar{S}_1}$ is the covariance between $\bar{S}$ and $\bar{S}_1$. $C_1$ and $C_2$ are constants, which are usually set as $C_1 = (K_1 L)^2$ and $C_2 = (K_2 L)^2$ with $K_1 = 0.01$, $K_2 = 0.03$ and $L = 255$.
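A minimal sketch of the standard SSIM expression of [11], evaluated on two structure patch sets, is given below. Pooling the statistics over all pixels of the patch set and using the unbiased (n − 1) estimators are our assumptions, and the function name `ssim` is only illustrative.

```python
def ssim(patches_a, patches_b, L=255, K1=0.01, K2=0.03):
    """SSIM from statistics pooled over all pixels of two patch sets."""
    a = [v for p in patches_a for row in p for v in row]
    b = [v for p in patches_b for row in p for v in row]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    # Unbiased variance and covariance estimates (assumed choice).
    var_a = sum((v - mean_a) ** 2 for v in a) / (n - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (n - 1)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)
    c1, c2 = (K1 * L) ** 2, (K2 * L) ** 2
    return ((2 * mean_a * mean_b + c1) * (2 * cov + c2)) / (
        (mean_a ** 2 + mean_b ** 2 + c1) * (var_a + var_b + c2))


edge = [[[0, 255], [255, 0]]]      # one 2 x 2 structure patch
inverted = [[[255, 0], [0, 255]]]  # the same patch with inverted contrast
ssim_same = ssim(edge, edge)
ssim_inverted = ssim(edge, inverted)
```

Identical structure patches give a value of 1, while the contrast-inverted patch gives a negative value, so a filter that destroys or flips edges in the structure regions is penalized.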

D. EVALUATION INDEX T3SI
According to the analysis in Section IV-B, 1 − EPI is positively related to the texture smoothness and can be used to measure the information loss of texture details. Similarly, the structure similarity index SSIM positively measures the retained information of feature details. 1 − EPI and SSIM belong to [0, 1] and can be regarded as information energy; higher values indicate higher information energy. As is well known, the entropy function $-x\log_2(x)$ is very suitable for formulating the loss of information energy. Consequently, we synthesize the two information indexes 1 − EPI and SSIM to construct a cross entropy function, named Texture Smoothness and Structure Similarity Index (T3SI). T3SI is used to evaluate texture smoothed images and is defined in terms of ξ = 1 − EPI, where EPI is calculated from the sets $S$ and $S_1$, and θ = SSIM, which is computed from the sets $\bar{S}$ and $\bar{S}_1$.
Since ξ and θ belong to the interval [0, 1], T3SI is not a monotonically increasing function of ξ and θ (see Fig. 3 (a)). This non-uniform change of T3SI with ξ and θ would lead to an unsuccessful comprehensive quality assessment. Analyzing the entropy formulation $-x\log_2(x)$, we can easily find that the entropy function attains its maximum value at x = 1/e ≈ 0.3679. Therefore, we normalize ξ and θ to the interval (0, 0.36], on which T3SI is a monotonically increasing function of ξ and θ, as shown in Fig. 3 (b).
In the normalized formulation of ξ and θ, ε is set as an infinitesimal constant such that the normalized values $\tilde{\xi}$ and $\tilde{\theta}$ are not equal to 0. Because of limited numerical precision, the values of T3SI for similar filtered images can coincide, and identical values of T3SI do not allow a comparative evaluation of different filtering methods. To make the differences between the evaluation results of the filtering methods more visible, we use the exponential function to further amplify T3SI, which gives the final formulation of T3SI (Eq. 6). Fig. 4 shows the comparison of the exponential T3SI and T3SI, from which we can easily find that the change ratio of the exponential T3SI obtained by Eq. 6 is obviously greater than that of T3SI computed by Eq. 3. The large change of the exponential T3SI under a slight change of the variables is helpful for comparing the final evaluation results.
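The location of the maximum quoted above follows from a short calculation, sketched here for a single entropy term $f(x) = -x\log_2 x$; the symbol $f$ is introduced only for this derivation.

```latex
\begin{align*}
f(x)   &= -x \log_2 x = -\frac{x \ln x}{\ln 2}, \qquad x \in (0, 1], \\
f'(x)  &= -\frac{\ln x + 1}{\ln 2}, \qquad
          f'(x) = 0 \iff x = e^{-1} \approx 0.3679, \\
f''(x) &= -\frac{1}{x \ln 2} < 0 \quad \text{for all } x > 0, \\
f(e^{-1}) &= \frac{1}{e \ln 2} \approx 0.5307 .
\end{align*}
```

Since $f'(x) > 0$ for $0 < x < e^{-1}$, each entropy term is strictly increasing on (0, 0.36], which is why restricting the normalized variables to this interval removes the non-monotonic behaviour.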
The pseudo code of our T3SI is detailed in

V. EXPERIMENT ANALYSIS
We collect some classic and meaningful texture images (see Fig. 5) as a reference set, and apply several excellent texture filtering methods, RGF [34], BF [35], RTV [36], mRTV [37] and ROG [38], to the set. After the set is processed by each filtering method with different parameters, we obtain 5 groups of test sets, and each group contains 48 filtered images that are used to examine the evaluation effect of different metrics in the subsequent analysis and experiments.

A. IMPACT OF PATCH
The result of T3SI ultimately depends on the texture and structure patches, and a different selection of patches will result in a different value of T3SI. In order to analyze the impact of the patches on T3SI, we chose a typical filtered mosaic picture 'harp' from one group of the test set, and performed experiments on different positions and different numbers of patches, respectively.

1) IMPACT OF PATCHES' LOCATIONS
Obviously, if we mistake a structural area for a texture region when calculating EPI, we will inevitably obtain a wrong evaluation result. Therefore, texture patches should contain as few feature details as possible for subjective applications, and structure patches should include significant edge features. Putting the same number of clicks inside different texture regions and structure regions four times, we obtain four sets of texture and structure patches. Fig. 6 (a)-(d) shows four different locations of texture and structure patches on the filtered image, respectively. Based on the patches at these four random locations, EPI, SSIM and T3SI of the filtered image were computed and are listed in Tab. 1. The texture patches in Fig. 6 (a)-(c) are fixed in some positions, while the structural patches are located in different positions each time. Conversely, the selection of structure patches in Fig. 6 (e)-(g) does not change. We can easily find from the results in Tab. 1 that the deviations of SSIM and EPI are very small. This indicates that the patches' positions have little effect on T3SI, that is, T3SI is globally stable with respect to the selection of patch locations.

2) IMPACT OF PATCHES' NUMBER
We randomly select nine filtered images with typical textures and features from a test set, and set a moderate patch radius r = 12 to ensure that all patches lie inside texture regions for each test image. The values of T3SI are calculated with different numbers of patches, and the relationship between T3SI and the number of patches is shown in Fig. 7. We find that T3SI is basically stable once 5 patches are chosen. This shows that only 4-6 patches containing general textures and significant structures are enough to calculate the index T3SI accurately.

B. COMPARATIVE ANALYSIS
We design an independent User Study to verify the comparison between our T3SI and other IQA metrics. Considering the image sharpness, the texture smoothness and the structure retention, we invite 100 volunteers to comprehensively evaluate the visual quality on a ten-point scale from 1 to 10. The final comprehensive evaluation result of each filtered image is given by the average score of the 100 volunteers. Comparing User Study with our proposed metric, the results on the test images demonstrate that our proposed comprehensive evaluation for each test image set is almost consistent with the subjective User Study. More experimental results are exhibited in the following comparison.

1) OVERALL PERFORMANCE
Choosing recent excellent IQA metrics, NIQE [15], PIQE [17], BRISQUE [18], SFF [32], IFS [33], QILV [31], and our T3SI, to evaluate one group of test images, we compared the results of these metrics with User Study. According to the results of User Study sorted from the smallest to the largest, the filtered images of each test group are sequentially labeled i (i = 1, 2, ...), and the results of the other evaluation methods are also arranged in the order of the image labels. Normalizing the results of User Study and of each metric into the interval [0, 1], we plot the comparative curves between User Study and each quality assessment method. Fig. 8 shows the comparative results with User Study, from which we can easily find that the assessment result of T3SI is very close to that of User Study (see Fig. 8 (g)). There is a large difference between the results of the other methods and User Study, since User Study comprehensively considers the texture smoothness and the structure sharpness, which are two essential indexes for texture image assessment. This verifies that our proposed objective evaluation T3SI is applicable to subjective computer vision.
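The ordering and normalization used for these comparative curves can be sketched as follows. The helper names `min_max` and `align_to_user_study` and the toy scores are hypothetical; min-max scaling is our assumption about how the results are mapped into [0, 1].

```python
def min_max(values, lo=0.0, hi=1.0):
    """Linearly rescale values into [lo, hi]; constant input maps to lo."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [lo for _ in values]
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]


def align_to_user_study(user_scores, metric_scores):
    """Label images by ascending User Study score, then normalize both series."""
    order = sorted(range(len(user_scores)), key=lambda i: user_scores[i])
    user_sorted = [user_scores[i] for i in order]
    metric_sorted = [metric_scores[i] for i in order]
    return min_max(user_sorted), min_max(metric_sorted)


# Toy scores for three filtered images (not taken from the experiments).
user = [6.0, 2.0, 8.0]
metric = [0.50, 0.10, 0.90]
user_curve, metric_curve = align_to_user_study(user, metric)
```

A metric that agrees with User Study produces a curve that rises together with the User Study curve. Calling `min_max(values, 0.1, 0.9)` gives the [0.1, 0.9] scaling used for the bar graphs in the next subsection.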

2) INDIVIDUAL PERFORMANCE
We respectively select a simple, a medium-scale and a complex texture image from Fig. 5, and the three corresponding texture filtered images are also taken from the 5 test sets. Using the three references and the corresponding test images, the specific results of all aforementioned evaluation methods are listed in Tab. 2. For subjective observation, we also zoomed in on some local details in each selected image, as shown in Fig. 9-11. We could find from the local zoomed textures and structures that, the visual results of de-texturing images
Tab. 2. Assessment values of NIQE [15], PIQE [17], BRISQUE [18], SFF [32], IFS [33], QILV [31], T3SI and User Study for the results (see Fig. 9 (b)-(e)) provided by five filtering methods for the same input.
Since the ranges of the evaluations are not uniform, we normalized the eight groups of evaluation values to the interval [0.1, 0.9] for better graphical comparison. Each group of evaluation results is exhibited by five bars (see Fig. 12), and the height of a bar represents the texture IQA value. Comparing the heights of the bars, we find that the no-reference quality assessment algorithms NIQE, PIQE and BRISQUE are obviously superior to the full-reference methods (SFF, IFS, QILV) on simple texture images (see the 1st-3rd groups of color-bars in Fig. 12 (a)). Due to their unilateral measurement of structures, the height order of the bars shows that full-reference metrics give results opposite to User Study, as shown by the 4th-5th and 8th groups of color-bars in Fig. 12 (a)-(c).
For the medium and complex texture filtered images, the no-reference quality assessment algorithms can hardly obtain an effective measurement (see the 1st-3rd groups of color-bars in Fig. 12 (b) and (c)). Conversely, the seventh and eighth groups of bars in Fig. 12 (a)-(c) show that, regardless of the texture scale of the images, our T3SI is close to the results of User Study for evaluating texture filtered images.

3) PERFORMANCE ON NOISY IMAGES
In order to apply T3SI to noisy images, we take the noisy image as the reference for calculating EPI and the clean image as the reference input of SSIM. Choosing five images with a large flat region from the LIVE database [39], [40], we apply filtering methods to the noisy images and obtain the denoised images shown in Fig. 13 (c)-(e). When we regard the noise in flat areas as texture, the cleanliness of the flat areas is analogous to the texture smoothness. Hence, we can select some patches inside flat regions to measure the removal of noise, and choose some patches containing large edge features to evaluate the retention of structures. Based on these two measurements, T3SI can be used to evaluate denoised images. We compared our T3SI with the metrics NIQE [15], PIQE [17], BRISQUE [18], SFF [32], IFS [33], QILV [31], PSNR [23] and SSIM [11]. The evaluation results are detailed in Tab. 3. Since PSNR and SSIM are generally used to evaluate the quality of denoised images, the results of the other metrics are compared with theirs. We observe from PSNR and SSIM that, for the three denoised results of each noisy image, (e) is the best, (d) is the second, and (c) is the worst. Comparing these metrics, the results of the full-reference methods SFF [32], IFS [33], QILV [31] and our approach for the same noisy image are consistent with PSNR/SSIM, while the no-reference methods perform poorly. Indeed, we can intuitively see that the denoising effects in (d) and (e) are obviously superior to that in (c). Our T3SI is in line with subjective evaluation, and the experimental results demonstrate that our method has good subjective applicability for IQA.
Tab. 3. Assessment values of NIQE [15], PIQE [17], BRISQUE [18], SFF [32], IFS [33], QILV [31], PSNR [23], SSIM [11] and T3SI for the denoised results (see Fig. 13 (c)-(e)).

VI. CONCLUSION
We present an interactive assessment method, T3SI, for measuring texture filtered images that are difficult to evaluate with existing IQA metrics. T3SI, constructed by using an exponential cross entropy formulation, can comprehensively evaluate the texture smoothness and the structure retention, which makes the objective evaluation consistent with subjective practical applications. In particular, the local patches selected by human perception have the advantage of accurately identifying textures, by which T3SI unifies the local and overall assessment as well as the objective and subjective evaluation. T3SI is significant for image texture quality assessment, and numerous experiments demonstrate that T3SI is stable, comprehensive and quantitative in texture image quality assessment (TIQA).
In our future work, we will attempt to improve the adaptive selection of patches according to different texture patterns, such that our proposed T3SI can be used to assess a large number of texture filtered images.
CHONG LIU received the bachelor's degree in information and computing from the Harbin University of Technology, in 2002. He is currently pursuing the Ph.D. degree with the Nanjing University of Aeronautics and Astronautics, China. Since 2005, he has been a Lecturer with Anqing Normal University. His research interests include numerical computation, image processing, and machine learning. His research interests include geometry processing and geometric modeling, especially large-scale LiDAR point data capturing, management, processing, and analysis. VOLUME 8, 2020