Comprehensive Criteria-Based Generalized Steganalysis Feature Selection Method

Redundant components in high-dimensional image steganalysis features increase the spatio-temporal complexity of steganalysis and can even reduce the detection accuracy for stego images. To reduce the feature dimension, improve the detection accuracy, and achieve fast feature selection, this paper proposes a general method for image steganalysis feature selection. Firstly, a feature metric algorithm based on the difference function is given; it measures the difference of each steganalysis feature component between the cover image class and the stego image class, which provides the basis for selecting the components that contribute most to detecting stego images. Secondly, the Pearson correlation coefficient is improved and used to measure the correlation between the steganalysis feature components and the image classification results, which provides the basis for removing redundant components. Then, by setting thresholds, the components with a large difference function value are selected, and those with a small Pearson correlation coefficient are deleted. Finally, the retained components are used as the final steganalysis feature for training and detection. A series of experimental results indicates that this method effectively reduces the feature dimension and the spatio-temporal complexity of steganalysis while maintaining or even improving the detection accuracy for stego images. Compared with existing steganalysis feature selection methods such as Fisher-GFR and Improved-Fisher, the features selected by this method achieve a higher detection accuracy for stego images.


I. INTRODUCTION
Steganography, a significant branch of multimedia security, is a technique for embedding hidden information in public carriers and transmitting it over public channels so that the information is concealed from others. Steganalysis, in turn, is designed to detect and extract information that has been hidden by steganography, for the purpose of securing information. So far, many steganalysis algorithms have emerged, such as the 548-dimensional CC-PEV [1], [2] (Cartesian Calibration based feature proposed by Pevný) steganalysis feature, which enhances the PEV feature through Cartesian calibration, the 686-dimensional SPAM [3] (Subtractive Pixel Adjacency Matrix) steganalysis feature based on a second-order Markov chain, and the 1,234-dimensional CDF [4] (Cross-Domain Feature) steganalysis feature, which combines the CC-PEV and SPAM features. With the rapid development of digital media, improving the efficiency of digital steganalysis has become an urgent problem, and research interest in adaptive steganalysis algorithms for digital images has grown accordingly. These algorithms mainly extract features from the images and use a classifier for training and detection, and they have achieved good detection results. Commonly seen high-dimensional image steganalysis features include the 17,000-dimensional GFR [5] (Gabor Filter Residual) steganalysis feature, which uses the Gabor filter to describe images at different scales and directions, the 22,510-dimensional CC-JRM [6] (Cartesian-Calibrated JPEG Rich Model) steganalysis feature, which is built from a system of sub-models of the joint distribution of frequency-domain and spatial-domain DCT coefficients and covers a wide range of statistical correlations, and the 34,671-dimensional SRM [7] (Rich Models for Steganalysis) steganalysis feature, the full spatial-domain rich model.
The associate editor coordinating the review of this manuscript and approving it for publication was Gautam Srivastava.
Although high-dimensional features can achieve a low detection error rate in adaptive image steganalysis, the higher feature dimensions extracted by adaptive steganalysis algorithms result in even greater spatio-temporal complexity, which hinders the development of image steganalysis technology. Therefore, selecting high-quality features to reduce the spatio-temporal complexity of detecting stego images has become a focus of current research in the field of steganalysis.
Some scholars have conducted a series of studies on feature selection and dimension reduction for steganalysis. Depending on the objects they target, the existing steganalysis feature selection methods can be classified into specific methods and general methods. A specific method is designed for one particular steganalysis feature. This type of method is relatively simple compared with general steganalysis feature selection, but its scope of application is comparatively narrow. Classic methods of this type include the following. Reference [8] proposed a Fisher criterion based GFR feature subspace selection algorithm (abbreviated as the Fisher-GFR method). With this method, more efficient feature subspaces can be selected in a targeted way to improve the detection performance of the GFR steganalysis feature; however, the detection accuracy obtained by this method does not improve significantly when the quality factor gets higher. In our previous work [9], we proposed a multi-scale feature selection method for the GFR steganalysis feature, which combines the SNR criterion and the improved Relief algorithm to remove useless and low-contribution steganalysis feature components, significantly reducing the feature dimension while maintaining detection accuracy; however, it has no significant effect on other steganalysis features.
A general method for steganalysis feature selection is applicable to various steganalysis features: it measures the contribution of the steganalysis feature components to detecting stego images and selects the components that contribute significantly as the steganalysis feature vectors for training and detection. Reference [10] proposed three different evolutionary computation techniques for feature selection, among which the DE method performs best in terms of detection accuracy and feature dimension and is easier to implement; however, all three evolutionary algorithms are time-consuming in the feature selection process. Reference [11] proposed a Rich Model feature optimization method based on the improved Fisher criterion (abbreviated as the Improved-Fisher method), which uses the improved Fisher criterion to evaluate the separability of the steganalysis feature components, the sub-model features, and the steganalysis feature vectors, respectively. This method can significantly reduce the feature dimension; however, its detection accuracy for stego images needs to be further improved. Reference [12] proposed a steganalysis feature selection method based on decision rough set α-positive region simplification. This method reduces the steganalysis feature dimension while maintaining the detection accuracy for stego images; however, the feature selection depends on the results of the classifier and requires repeated training, which results in a high temporal complexity of the selection.
So far, the detection accuracy achieved by the features selected with existing methods still needs to be improved. In response to this problem, this paper proposes a generalized steganalysis feature selection method based on comprehensive criteria (abbreviated as the CGSM method). Firstly, we give a feature metric algorithm based on the difference function, with which we measure the difference of the steganalysis feature components between the cover image class and the stego image class. Secondly, we improve the Pearson correlation coefficient and apply it to measure the correlation between the steganalysis feature components and the image classification results. Then, by setting thresholds for the difference function and the Pearson correlation coefficient, the components with a larger difference function value are selected, and those with a Pearson correlation coefficient lower than the threshold are deleted. Finally, the retained components are used as the final selected steganalysis feature vectors for training and detection. The method is expected to reduce the feature dimension and improve the detection accuracy for stego images while reducing the dependence of feature selection on classification results, thereby reducing the spatio-temporal complexity of detecting stego images.
The rest of the paper is arranged as follows: Section II describes the related work. Section III gives a metric of the contribution of the steganalysis feature components. Section IV proposes a feature selection method based on multi-criteria decision making. Section V analyzes the effect of the CGSM method on detecting the stego images through a series of experiments. Section VI further discusses the experiment in this paper. Section VII summarizes the full text.

II. RELATED WORK
The difference function in Pignistic probability [13] is a mathematical formula that effectively measures the difference between two different classes, as follows.
\[ BetP_{m}(A) = \sum_{A \in X} \frac{1}{|X|} \cdot \frac{m(X)}{1 - m(\varphi)} \tag{1} \]
\[ difBetP_{m_1}^{m_2} = \max_{A} \left( \left| BetP_{m_1}(A) - BetP_{m_2}(A) \right| \right) \tag{2} \]
where $BetP_{m_1}(A)$ and $BetP_{m_2}(A)$ represent the Pignistic probability values of the two classes, respectively, $difBetP_{m_1}^{m_2}$ represents the difference function between the two classes, $|X|$ represents the number of elements in a set $X$ containing evidence $A$, $m(X)$ represents the value of the probability mass (Mass) function of a set containing the evidence, and $m(\varphi)$ represents the probability mass assigned to the empty set. Equation (1) converts the Mass function, which is hard to interpret, into a probabilistic form that matches people's intuition (i.e. $BetP_{m_i}(A)$), and the difference between the two classes is then measured using (2), i.e. the value of $difBetP_{m_1}^{m_2}$. The Pearson correlation coefficient can be used to efficiently measure the degree of correlation between two classes [14]; the main idea is that the stronger the positive correlation between two classes, the greater the Pearson correlation coefficient. The formula is as follows.
\[ \rho_{m_1}^{m_2} = \frac{cov(m_1, m_2)}{\sigma_{m_1} \cdot \sigma_{m_2}} \tag{3} \]
where $cov(m_1, m_2)$ represents the covariance between $m_1$ and $m_2$, and $\sigma_{m_1} \cdot \sigma_{m_2}$ represents the product of the respective standard deviations of $m_1$ and $m_2$. The larger the value calculated by (3), the stronger the correlation between the two classes.
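As a rough illustration of the Pignistic transform and the difference function described above, the following Python sketch may help. It assumes the standard form in which every set X shares its mass equally among its elements after renormalising by 1 − m(φ), and it assumes that A ranges over the single elements of the frame. The function names `pignistic` and `dif_betp`, and the toy mass functions, are hypothetical and are not taken from the paper.

```python
def pignistic(mass, frame):
    """Pignistic probability BetP(A) for every single element A of `frame`.

    `mass` maps frozensets (the sets X) to their mass m(X). Mass on the
    empty set is renormalised away through the factor 1 - m(empty).
    """
    empty_mass = mass.get(frozenset(), 0.0)
    betp = {element: 0.0 for element in frame}
    for subset, value in mass.items():
        if not subset:
            continue  # the empty set has no element to share its mass with
        share = value / (len(subset) * (1.0 - empty_mass))
        for element in subset:
            betp[element] += share
    return betp


def dif_betp(mass_1, mass_2, frame):
    """Largest absolute gap between the two pignistic probabilities."""
    betp_1 = pignistic(mass_1, frame)
    betp_2 = pignistic(mass_2, frame)
    return max(abs(betp_1[a] - betp_2[a]) for a in frame)


# Hypothetical toy evidence over a three-element frame (assumption).
frame = {"a", "b", "c"}
m1 = {frozenset({"a"}): 0.6, frozenset({"a", "b"}): 0.4}
m2 = {frozenset({"b"}): 0.5, frozenset({"a", "b", "c"}): 0.5}
print(pignistic(m1, frame))
print(dif_betp(m1, m2, frame))
```

In this toy case the element "a" receives 0.6 + 0.4/2 from the first mass function and 0.5/3 from the second, so it dominates the difference function.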

III. CONTRIBUTION METRIC OF THE STEGANALYSIS FEATURE COMPONENTS
The detection performance for stego images depends to some extent on how much the steganalysis feature components contribute to detecting them. Because not all steganalysis feature components contribute equally, and it is not realistic to obtain all features from a limited training set, it is necessary to measure the contribution of each steganalysis feature component and then select the components that contribute significantly for training and detection. Components with a large contribution increase the detection accuracy, while components with little or no contribution only increase the feature dimension, which enlarges the spatio-temporal complexity of detecting stego images and can even decrease the detection accuracy. Thus, we propose a difference function-based metric to measure the contribution of the steganalysis feature components to detecting stego images. However, components that differ widely between the two classes are not necessarily all favorable for detecting stego images, whereas components that are strongly correlated with the image classification results are more favorable. Therefore, the Pearson correlation coefficient is introduced in this paper to measure the correlation between the steganalysis feature components and the image classification results.

A. CLASSIFICATION METRIC OF THE CONTRIBUTION OF STEGANALYSIS FEATURE COMPONENTS BASED ON THE DIFFERENCE FUNCTION
In order to measure the contribution of the steganalysis feature components to detect the stego images, the difference function (i.e. difBetP S C ) based on Pignistic probability is proposed in this paper to measure the difference between the steganalysis feature components in the cover image class and the stego image class, as follows.
where  the jth cover image class or stego image class, respectively. Utilizing (4) the value of the difference function between the jth cover image class and stego image class for each steganalysis feature component can be obtained.
To illustrate why steganalysis feature components with a large difference function value favor the detection of stego images, mathematical expressions and textual descriptions are given next. In pattern recognition, the farthest distance method can be used to define the distance between two classes: one point is taken from each of the two classes such that the distance between the two points is the largest, as follows.
\[ D_{kl} = \max \{ d_{ab} \} \tag{8} \]
where $d_{ab}$ represents the distance between a point $a$ in one class and a point $b$ in the other class, and $D_{kl}$ represents the maximum value of $d_{ab}$.
For a more intuitive understanding of the farthest distance method, the farthest distance method diagram is shown in Figure 1.
Afterward, the relative distance between the two classes is used as a criterion to illustrate that the wider the distance between the classes, the better the two classes can be classified. For two image classes a and b, let $a_i$ and $b_i$ represent samples in the cover image class and the stego image class, respectively, and let $c_a$ and $c_b$ represent the mean samples of the cover image class and the stego image class, respectively. According to the farthest distance method, the distance between the two classes is $d_{ab} = \max \lVert a_i - b_i \rVert$. The minimum boundary radii of the two classes, $r_a$ and $r_b$, and the relative distance between classes a and b, $D_{ab}$, are defined in (9) to (11). As can be seen from (11), $D_{ab}$ increases with $d_{ab}$. That is, the greater the farthest distance between the cover image class and the stego image class, the greater the relative distance between them, the better the separability between the two classes, and the lower the classification error. We take the cover image class and the stego image class as the two classes; the greater the distance between them, the better the classification and the greater the contribution to detecting stego images. Therefore, in steganalysis feature selection, selecting the steganalysis features with wide differences is expected to improve the detection accuracy for stego images while reducing the feature dimension.
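The farthest distance method described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: samples are taken to be points in Euclidean space, and the function name `farthest_distance` and the toy classes are hypothetical.

```python
from math import dist


def farthest_distance(class_a, class_b):
    """Largest Euclidean distance between one point of each class."""
    return max(dist(a, b) for a in class_a for b in class_b)


# Hypothetical two-dimensional toy samples (assumption, not paper data).
cover = [(0.0, 0.0), (1.0, 0.0)]
stego_far = [(4.0, 0.0), (5.0, 0.0)]
stego_near = [(1.0, 1.0), (2.0, 0.0)]

print(farthest_distance(cover, stego_far))
print(farthest_distance(cover, stego_near))
```

Under the argument in the text, the pair with the larger farthest distance (`cover` against `stego_far`) would be the easier one to separate.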
From the above, it can be seen that the larger the difference function value of a steganalysis feature component between the cover image class and the stego image class, the greater the difference between the two classes for this component, the more likely the component is a valuable feature, and the better its effect on detecting stego images.
The specific steps for measuring the contribution of the steganalysis feature components to detecting stego images based on the difference function are shown in Algorithm 1.
Based on the difference function metric, the difference of the steganalysis feature components between the cover image class and the stego image class is measured, which provides a basis for selecting the components that contribute significantly to detecting stego images. Although widely different steganalysis feature components may be more favorable for detecting stego images, [15] pointed out that there are wide differences between the cover image classes and the stego image classes with embedded information, i.e., not all differences in steganalysis feature components between the cover image classes and the stego image classes are necessarily favorable for detecting stego images. We consider that the steganalysis feature components that are strongly correlated with the image classification results help the steganalysis feature to detect stego images. Therefore, in the next section, we introduce the Pearson correlation coefficient metric to measure the correlation between the steganalysis feature components and the image classification results.

B. CORRELATION METRIC OF STEGANALYSIS FEATURE COMPONENTS BASED ON PEARSON CORRELATION COEFFICIENT
Not all steganalysis feature components with large difference function values are conducive to detecting stego images, and the stronger a component's correlation with the image classification results, the more conducive it is. Therefore, to further reduce the feature dimension and improve the detection accuracy for stego images, this paper refines the Pearson correlation coefficient and applies it to measure the correlation between the steganalysis feature components and the image classification results. In steganalysis, the traditional Pearson correlation coefficient is used to calculate correlations between steganalysis feature components. The stronger the correlation between two components, the more similar they are, so only one of them needs to be kept for training, which reduces the dimension. However, this requires measuring, in a loop, the correlation of each component with all the remaining components before the highly correlated ones are deleted, which increases the temporal complexity of the metric and is unsuitable for rapid steganalysis.
According to Peng et al. [16], the better a steganalysis feature component scores under the mutual information metric and the stronger its correlation with the image classification results, the more valuable it is for detecting stego images. Given that highly correlated steganalysis feature components are therefore preferred, the Pearson correlation coefficient of the $i$th steganalysis feature component is computed as
\[ \rho_{S}^{C} = \frac{cov(f_{C}^{i}, f_{S}^{i})}{\sigma_{f_{C}^{i}} \cdot \sigma_{f_{S}^{i}}} \tag{13} \]
where $cov(f_{C}^{i}, f_{S}^{i})$ represents the covariance of the $i$th steganalysis feature component between the cover image class and the stego image class, and $\sigma_{f_{C}^{i}} \cdot \sigma_{f_{S}^{i}}$ represents the product of the standard deviations of the steganalysis feature component in the cover image class and the stego image class.
The larger the Pearson correlation coefficient (i.e. $\rho_{S}^{C}$) between the cover image class and the stego image class, the stronger the correlation between the steganalysis feature component and the image classification results, and the better the component is for image classification, i.e. for the detection of stego images. Therefore, selecting the steganalysis feature components whose Pearson correlation coefficient is greater than or equal to the threshold as the final steganalysis feature vectors for training can reduce the dimension of the steganalysis feature.
The specific steps for detecting steganalysis feature components correlation based on the Pearson correlation coefficient metric are shown in Algorithm 2.
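The correlation metric of (13) can be sketched as follows. This is not the paper's Algorithm 2; it is a minimal illustration that assumes the cover and stego feature matrices have the same shape and that row k of each matrix belongs to the same source image. The names `pearson`, `correlation_per_component` and `keep_by_correlation`, the zero-variance convention, and the toy data are all assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient: covariance over product of std devs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0.0 or spread_y == 0.0:
        return 0.0  # assumption: a constant component counts as uncorrelated
    return cov / (spread_x * spread_y)


def correlation_per_component(cover, stego):
    """One coefficient per feature component (column) of the two matrices."""
    n_components = len(cover[0])
    return [
        pearson([row[i] for row in cover], [row[i] for row in stego])
        for i in range(n_components)
    ]


def keep_by_correlation(rhos, tau):
    """Indices of the components whose coefficient reaches the threshold."""
    return [i for i, rho in enumerate(rhos) if rho >= tau]


# Hypothetical features: 4 images, 2 components each (assumption).
cover = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
stego = [[1.1, 2.0], [2.1, 5.0], [2.9, 1.0], [4.2, 4.0]]
rhos = correlation_per_component(cover, stego)
print(rhos, keep_by_correlation(rhos, tau=0.9))
```

With a threshold of 0.9, only the first toy component is retained, because the second one is negatively correlated between the two classes.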
In this way, this paper not only measures the difference between the steganalysis feature components in the cover image class and the stego image class based on the difference function to provide a basis for selecting the steganalysis feature components that contribute significantly to detection of the stego images, but also improves the Pearson correlation coefficient to measure the correlation between the steganalysis feature components and the image classification results to provide a basis for removing redundant steganalysis feature components. In the next section, we will give the general flow of the proposed method and the algorithm of the method for the readers to better understand how the method works in this paper.

IV. FEATURE SELECTION METHOD BASED ON COMPREHENSIVE CRITERIA DECISION MAKING
In this paper, a CGSM method is proposed, which reduces the dimension of the steganalysis feature and improves the detection accuracy of the stego images, and can reduce the dependence on the selected features of classification detection results, so as to reduce the spatio-temporal complexity of detecting the stego images.

A. MAIN STEPS OF CGSM METHOD
The CGSM method proceeds as follows. Firstly, a difference function is proposed to measure the difference of each steganalysis feature component between the cover image class and the stego image class, and the difference function values are ranked in descending order. Then, by setting a threshold m, the first m steganalysis feature components, i.e. those with the widest difference, are retained. Secondly, the Pearson correlation coefficient is improved and applied to measure the correlation between the steganalysis feature components and the image classification results. Thirdly, by setting a threshold τ, the steganalysis feature components with a Pearson correlation coefficient lower than the threshold are deleted. Finally, the steganalysis feature components with a Pearson correlation coefficient greater than or equal to the threshold are retained as the final steganalysis feature vectors. The main steps are as follows.
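(1) Constructing feature sets. Steganalysis feature vectors for a cover image class and five stego image classes with different payloads are generated as training sets using the steganographic and extraction algorithms.
(2) Measuring the difference. The difference function value of each steganalysis feature component between the cover image class and the stego image class is calculated.
(3) Ranking. The difference function values are ranked in descending order.
(4) Setting the threshold m. The first m steganalysis feature components, i.e. those with the widest difference, are retained.
(5) Measuring the correlation. The Pearson correlation coefficient between the steganalysis feature components and the image classification results is calculated.
(6) Setting the threshold τ. The steganalysis feature components where the Pearson correlation coefficient is lower than the threshold are deleted.
(7) Determining the final selection of the steganalysis feature. The steganalysis feature components with Pearson correlation coefficients greater than or equal to the threshold are selected and retained as the final steganalysis feature vectors.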
After the above steps, we not only select the feature components that contribute strongly to classification but also remove the interference of the feature components with low correlation. This greatly reduces the feature dimension, and thus the spatio-temporal complexity of detecting stego images, and may also improve the detection accuracy for stego images.
Figure 2 illustrates the process of the CGSM method according to these main steps.
To better explain the working principle of the CGSM method, its specific algorithm is given as Algorithm 3.
In this order, we first select the steganalysis feature components with a wide difference between the cover image class and the stego image class and then, based on the Pearson correlation coefficient, remove from them the components that have a large difference function value but only a weak correlation with the image classification results. This further reduces the feature dimension and thus the space complexity of the classifier when detecting stego images.
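The two-stage selection described in this subsection can be sketched as below. This is a simplified illustration and not the paper's Algorithm 3: it assumes that the difference function value and the Pearson correlation coefficient of every component have already been computed, and the function name `cgsm_select` and the toy scores are hypothetical.

```python
def cgsm_select(diff_scores, rhos, m, tau):
    """Keep the m components with the largest difference function value,
    then drop those whose Pearson coefficient is below tau."""
    ranked = sorted(
        range(len(diff_scores)), key=lambda i: diff_scores[i], reverse=True
    )
    top_m = ranked[:m]
    return sorted(i for i in top_m if rhos[i] >= tau)


# Hypothetical per-component scores for five components (assumption).
diff_scores = [0.30, 0.05, 0.22, 0.40, 0.10]
rhos = [0.95, 0.99, 0.80, 0.92, 0.97]
selected = cgsm_select(diff_scores, rhos, m=3, tau=0.9)
print(selected)
```

Component 1 has a high correlation but is discarded in the first stage because its difference function value is small, while component 2 survives the first stage and is removed in the second.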

B. THRESHOLD ANALYSIS
To obtain a better selection result with the CGSM method, we need to explain how the difference function threshold m and the Pearson correlation coefficient threshold τ are set.
Firstly, by setting a step length for the difference function threshold m, the number of selected feature dimensions is progressively reduced and the trend of the corresponding detection accuracy is observed. Secondly, the step length of m between the two higher detection accuracies is reduced to 1/2 of the original step length, and the corresponding variation of the detection accuracy is observed again. Thirdly, repeating this step length reduction provides the basis for finding a better m. Previous work on steganalysis feature selection shows that the detection accuracy decreases significantly once the feature dimension is reduced to about half of the original. Therefore, by setting the step length of m, we can reduce the feature dimension to at most close to half of the original.
Algorithm 3 (continued)
12: Calculating the Pearson correlation coefficient between the cover image class and the stego image class for the ith steganalysis feature component according to Algorithm 2 (Section III-B);
13: end for
14: Setting threshold τ;
15: Removing the steganalysis feature components whose Pearson correlation coefficient calculated in step 12 is lower than the threshold;
16: The steganalysis feature components with a Pearson correlation coefficient greater than or equal to the threshold are selected and retained as the final selected steganalysis feature;
VOLUME 8, 2020
Since a correlation coefficient ρ ∈ [0.9, 1] indicates that the two variables are strongly correlated, we only need … at the higher of the two detection accuracies increases the experimental accuracy. Finally, the optimal detection accuracy and the corresponding feature dimension are selected as the result of the feature selected by the CGSM method for detecting stego images. To better explain the setting of the thresholds in the CGSM method, the specific algorithm for setting the thresholds is given as follows.
(1) Setting the initial difference function threshold. The step length of the difference function threshold m is set so that the number of feature dimension thresholds obtained initially is appropriate. The selected feature dimension is then gradually decreased, and the change trend of the corresponding detection accuracy is observed, so as to provide a basis for finding the feature with high detection accuracy.
(2) Making the difference function threshold progressively more accurate.
1) Changing the step length of m between the two higher detection accuracies: we decrease the step length to 1/2 of the original and continue to observe the change trend of the corresponding detection accuracy.
2) Repeating the step length reduction provides the basis for finding a better m.
(3) Setting the initial Pearson correlation coefficient threshold. An initial step length is set for the threshold τ, and the change trend of the corresponding detection accuracy is observed.
(4) Making the Pearson correlation coefficient threshold progressively more accurate.
1) Changing the step length of τ between the two higher detection accuracies: we decrease the step length to 1/2 of the original and continue to observe the change trend of the corresponding detection accuracy.
2) Repeating the step length reduction provides the basis for finding a better τ.
(5) Determining the final threshold. The threshold that is optimal for the detection accuracy of the stego images is selected as the threshold of the CGSM method for training and detection.
In this way, we determine the thresholds, which provides the basis for giving the CGSM method a better selection effect.
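The step-halving procedure for the threshold m can be sketched as a coarse-to-fine search. This is an illustration under stated assumptions rather than the authors' procedure: `evaluate` stands in for training and testing a classifier on the m retained components, `toy_accuracy` is a made-up accuracy curve, refinement is restricted to the two thresholds with the highest accuracy so far, and the number of refinement rounds is a free parameter.

```python
def refine_threshold(evaluate, start, stop, step, rounds=2):
    """Coarse-to-fine search; returns (best threshold, its accuracy)."""
    # Coarse pass: start, start - step, ... down to stop.
    accuracy = {t: evaluate(t) for t in range(start, stop - 1, -step)}
    for _ in range(rounds):
        step //= 2
        if step == 0:
            break
        # Refine only between the two thresholds with the highest accuracy.
        first, second = sorted(accuracy, key=accuracy.get, reverse=True)[:2]
        low, high = min(first, second), max(first, second)
        for t in range(low, high + 1, step):
            if t not in accuracy:
                accuracy[t] = evaluate(t)
    best = max(accuracy, key=accuracy.get)
    return best, accuracy[best]


def toy_accuracy(m):
    """Hypothetical accuracy curve that peaks at m = 510 (assumption)."""
    return 1.0 - abs(m - 510) / 1000.0


best_m, best_accuracy = refine_threshold(toy_accuracy, start=548, stop=212, step=48)
print(best_m, best_accuracy)
```

The same scheme would apply to τ with a fractional step length; an integer grid is used here only to keep the sketch short.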

C. PERFORMANCE ANALYSIS
Next, we analyze the temporal complexity of the CGSM method to give readers a better understanding of its performance. The CGSM method consists of the following steps: calculating the difference function values, calculating the Pearson correlation coefficients, ranking the difference function values in descending order, deleting the feature components that are below the threshold, and selecting the steganalysis feature. We analyze the temporal complexity of each step separately, as shown in Table 1.
In Table 1, there is no nested relationship between the individual steps listed, so the temporal complexity of the CGSM method is equal to the maximum temporal complexity of the individual steps. When $\log_2 N \le n$, the CGSM method has a temporal complexity of $O(nN)$; when $\log_2 N \ge n$, it has a temporal complexity of $O(N \log_2 N)$. However, most of the existing generalized feature selection methods rely on the results of the FLD integrated classifier, which has a temporal complexity of $O(L N^{trn} d_{sub}^{2}) + O(L D_{sub}^{3})$, so the temporal complexity of such selection methods must be greater than or equal to $O(L N^{trn} d_{sub}^{2}) + O(L D_{sub}^{3})$. It follows that the temporal complexity of a selection method that relies on the results of the integrated classifier is much greater than $O(nN)$ or $O(N \log_2 N)$. Thus, the CGSM method greatly improves operating efficiency and reduces the temporal complexity of classifier-based detection of stego images.

V. EXPERIMENTAL RESULTS AND ANALYSIS
To test the performance of the CGSM method proposed in this paper, we conducted a series of feature selection and comparison experiments using three steganalysis features: CC-PEV [1], [2], GFR [5], and CC-JRM [6]. All experiments were run on a laptop with an Intel Core i7 processor and 8 GB of RAM. All experiments were performed in MATLAB R2018a, and all figures were generated and processed in OriginPro 8.5.

A. EXPERIMENTAL SETUP
The software, hardware, image source, and extracted features are the same in all experiments, which ensures that the different methods are compared fairly. Each group of experiments is repeated five times to avoid chance effects in the selection of features and to make the experiments more reliable.
We processed the BOSSbase 1.01 image library as follows. Firstly, the 10,000 images in the image library were converted into JPEG images with a compression quality factor of 95. Then, the SI-UNIWARD [17] steganographic algorithm was used to generate 5 × 10,000 stego images from the 10,000 cover images, with payloads of 0.1, 0.2, 0.3, 0.4, and 0.5 bpAC. Finally, three steganalysis features, the 548-dimensional CC-PEV [1], [2], the 17,000-dimensional GFR [5], and the 22,510-dimensional CC-JRM [6], were extracted for the cover image class and the stego image classes, respectively, so that a total of (1 + 5) × 10,000 × 3 = 180,000 steganalysis feature vectors were obtained for the steganalysis feature set. The specific experimental parameters are shown in Table 2.
In this paper, the original and selected image features are trained and detected using the FLD integrated classifier. The main steps are as follows. Firstly, half of the cover image features and half of the stego image features corresponding to each payload are randomly selected from each feature set as the training set. Then, the remaining cover image features and stego image features corresponding to the different payloads are used as the test set. The error rate of this integrated classifier is
\[ P_E = \min_{P_{FA}} \frac{P_{FA} + P_{MD}}{N_{TS}} \]
where $P_{FA}$ represents the false alarm rate, $P_{MD}$ represents the missed detection rate, and $N_{TS}$ represents the number of test sets. Because the test set contains both a cover image set and a stego image set, $N_{TS} = 2$. The error rate represents the proportion of classification errors among the test steganalysis features. The training and detection experiment is repeated 10 times, and the mean of the 10 results is taken as the final average detection error rate (i.e. $\overline{P_E}$). The lower the average detection error rate, the better the steganalysis performance. To present the results of the comparison experiments more intuitively, we use $\overline{P_A} = 1 - \overline{P_E}$, where $\overline{P_A}$ represents the average detection accuracy. $\overline{P_A}$ shows directly how well the selected feature detects stego images: the larger $\overline{P_A}$ is, the better the feature selected by the method detects stego images.
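To make the error-rate definition concrete, the sketch below computes the minimum of (P_FA + P_MD) / 2 over decision thresholds and the corresponding accuracy. It assumes that the classifier produces a real-valued score per test image and that a score at or above the threshold is declared stego; the function name `min_average_error` and the scores are hypothetical, not outputs of the FLD integrated classifier.

```python
def min_average_error(cover_scores, stego_scores):
    """P_E: minimum over decision thresholds of (P_FA + P_MD) / 2."""
    thresholds = sorted(set(cover_scores) | set(stego_scores))
    thresholds.append(float("inf"))  # threshold that declares nothing stego
    best = 1.0
    for t in thresholds:
        # False alarm: a cover image scored at or above the threshold.
        p_fa = sum(s >= t for s in cover_scores) / len(cover_scores)
        # Missed detection: a stego image scored below the threshold.
        p_md = sum(s < t for s in stego_scores) / len(stego_scores)
        best = min(best, (p_fa + p_md) / 2)
    return best


# Hypothetical classifier scores for four cover and four stego images.
cover_scores = [0.1, 0.2, 0.35, 0.6]
stego_scores = [0.3, 0.7, 0.8, 0.9]
p_e = min_average_error(cover_scores, stego_scores)
p_a = 1.0 - p_e
print(p_e, p_a)
```

In this toy case the best threshold is 0.7, which yields no false alarms and one missed detection out of four stego images.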
The experiments mainly consist of 5 parts:
(1) Comparison experiment of the CC-PEV [1], [2] steganalysis feature before and after selection based on the CGSM method (Section V-B).
(2) Comparison experiment of the GFR [5] steganalysis feature before and after selection based on the CGSM method.

B. COMPARISON EXPERIMENT OF CC-PEV STEGANALYSIS FEATURE BASED ON CGSM METHOD BEFORE AND AFTER SELECTION
References [1], [2] proposed the 548-dimensional CC-PEV image steganalysis feature, which enhances the PEV feature through Cartesian calibration. This image steganalysis feature avoids the curse-of-dimensionality problem encountered with some steganography schemes and is a multi-class JPEG steganalysis feature that can significantly improve detection performance. To obtain a better selection result with CGSM method, the difference function threshold m and the Pearson correlation coefficient threshold τ are analyzed. In the selection experiment of CC-PEV steganalysis feature, we set the initial step length to 48 and refine threshold m according to the steps in Section IV-B.
The experimental results of utilizing the difference function to reduce the CC-PEV steganalysis feature dimension and the corresponding detection accuracy are shown in Table 3.
In Table 3, the bold numbers represent the highest detection accuracy for the same payload after selection, and the superscript at the top right of each feature dimension indicates the round in which that threshold m was set. For example, in the first round the threshold is set to 548, 500, 452, · · · , and 212. Since the detection accuracy is higher at feature dimensions 548, 500, and 452, the step length between them is reduced to half of the first step length, which gives the feature dimensions 524 and 476 in the second round, and so on. We select the feature dimension corresponding to the highest detection accuracy as the threshold (i.e. m = 500). Next, the Pearson correlation coefficient threshold τ is set to further reduce the dimension of the steganalysis feature.
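The step-halving search for threshold m described above can be sketched as follows. This is only an illustration of how the candidate dimensions are generated, assuming a coarse grid followed by midpoints between the promising neighbors; the helper names `coarse_grid` and `refine` are hypothetical, and the choice of which dimensions are promising still comes from the measured detection accuracy.

```python
def coarse_grid(full_dim, step, lowest):
    """First round: candidate thresholds m from full_dim down to lowest."""
    return list(range(full_dim, lowest - 1, -step))


def refine(promising, step):
    """Next round: halve the step and add midpoints between adjacent
    promising dimensions that are exactly one step apart."""
    half = step // 2
    pts = sorted(promising, reverse=True)
    mids = [hi - half for hi, lo in zip(pts, pts[1:]) if hi - lo == step]
    return mids, half


if __name__ == "__main__":
    first_round = coarse_grid(548, 48, 212)
    print(first_round)   # [548, 500, 452, 404, 356, 308, 260, 212]
    # Suppose accuracy was highest at 548, 500 and 452 (as in the text)
    second_round, new_step = refine([548, 500, 452], 48)
    print(second_round, new_step)   # [524, 476] 24
```

The same two helpers reproduce the GFR setting when called with an initial step length of 2,000 instead of 48.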
Utilizing the above method of setting the Pearson correlation coefficient threshold, we select components of the CC-PEV steganalysis feature. The selected feature is used to detect the stego images and obtain the detection accuracy, and the variation of the detection accuracy under different thresholds is observed.
The feature dimensions obtained by CGSM method with the difference function and the Pearson correlation coefficient, together with the corresponding detection accuracy of the stego images, are shown in Table 4.
In Table 4, 'Dim' represents the feature dimension, and $P_A$ represents the detection accuracy. From Tables 3 and 4, when Payload = 0.1, the original 548-dimensional CC-PEV steganalysis feature has a detection accuracy of 0.5201 for the stego images, while the feature selected by CGSM method reaches a detection accuracy of 0.5226, an improvement of 0.25% over the original. When Payload = 0.2, 0.3, 0.4, and 0.5, the detection accuracy of the CC-PEV feature selected by CGSM method is improved by 0.2%, 0.22%, 0.16%, and 0.21% over the original, respectively. At the same time, the dimension of the selected feature is only 75.36%-79.56% of the original feature dimension at the different payloads, which reduces the cost of classifier training.
In order to compare the selection of CC-PEV steganalysis feature based on CGSM method more visually, the feature dimension before and after selection and the corresponding detection accuracy are shown in Figure 3.
In Figure 3, the horizontal axes represent the feature dimension, the vertical axes represent the corresponding detection accuracy, and the blue stars represent the optimal detection result of the CC-PEV steganalysis feature after selection by CGSM method. From the figure, it can be clearly seen that CGSM method proposed in this paper can improve the detection accuracy of CC-PEV steganalysis feature for the stego images while significantly reducing the feature dimension, which indicates the effectiveness of CGSM method. We can also observe that, at each payload, the broken line in the figure first rises and then falls.
The reason for this phenomenon is that when the threshold is too small, not enough redundant features are excluded, which affects the detection accuracy of the steganalysis feature. On the contrary, when the threshold is too large, features that are useful for detecting the stego images are also excluded, leading to a decrease in the detection accuracy of the feature after selection.

C. COMPARISON EXPERIMENT OF GFR STEGANALYSIS FEATURE BASED ON CGSM METHOD BEFORE AND AFTER SELECTION
Reference [5] proposed the 17,000-dimensional GFR image steganalysis feature, a JPEG rich model constructed using Gabor filters. This feature describes the image at different scales and orientations and has good detection performance.
To obtain a better selection result with CGSM method, the difference function threshold m and the Pearson correlation coefficient threshold τ are analyzed. In the selection experiment of GFR steganalysis feature, we set the initial step length to 2,000 and refine threshold m according to the steps in Section IV-B.
The experimental results of utilizing the difference function to reduce GFR steganalysis feature dimension and the corresponding detection accuracy are shown in Table 5.
In Table 5, the bold numbers represent the highest detection accuracy for the same payload after selection, and the superscript at the upper right of each feature dimension indicates the round in which that threshold m was set. We select the feature dimension with the best effect as the threshold (i.e. m = 16,000). Then, the Pearson correlation coefficient threshold is set to further reduce the steganalysis feature dimension.
Utilizing the above method of setting the Pearson correlation coefficient threshold, we select components of the GFR steganalysis feature. Then, we detect the stego images with the selected feature and observe the variation in detection accuracy under different thresholds.
The experimental results based on the difference function and the Pearson correlation coefficient after selection of CGSM method and the detection accuracy of the stego images are shown in Table 6.
In Table 6, 'Dim' represents the feature dimension, and $P_A$ represents the detection accuracy. From Tables 5 and 6, when Payload = 0.1, the detection accuracy of the original 17,000-dimensional GFR steganalysis feature is 0.5168, while the detection accuracy of the feature selected by CGSM method reaches 0.5218, which is 0.5% higher than the original detection accuracy. When Payload = 0.2, 0.3, 0.4, and 0.5, the detection accuracy of the features selected by CGSM method is improved by 0.24%, 0.13%, 0.13%, and 0.05% over the original, respectively. At the same time, at the different payloads, the dimension of the selected GFR feature can be reduced to 44.78% of the original feature dimension, which reduces the cost of classifier training. We also found that the thresholds τ = 0.96, 0.94, and 0.92 were never selected, so the corresponding cases of these three thresholds are omitted to keep the table concise.
In order to more visually compare the selection of GFR steganalysis feature based on CGSM method, the feature dimension before and after selection and the detection accuracy are shown in Figure 4 as follows.
In Figure 4, the horizontal axes represent the feature dimension, the vertical axes represent the corresponding detection accuracy, and the blue stars represent the optimal detection result of the GFR steganalysis feature after selection by CGSM method. It can be clearly seen from the figure that CGSM method proposed in this paper can improve the detection accuracy of GFR steganalysis feature for the stego images while greatly reducing the feature dimension, which indicates the effectiveness of this method. Moreover, we found that although features selected by CGSM method improve the detection accuracy of the stego images under all payloads, the advantage of this method decreases as the payload increases. This may be because GFR steganalysis feature is nonlinear, and the higher the payload is, the stronger the nonlinearity is, whereas the Pearson correlation coefficient is better suited to measuring linear correlation. CGSM method is therefore more beneficial for selecting GFR steganalysis features at low payloads.
In order to test the performance of this method on other steganalysis features, an experiment on the CC-JRM steganalysis feature is also conducted in this paper.

D. COMPARISON EXPERIMENT OF CC-JRM STEGANALYSIS FEATURE BASED ON CGSM METHOD BEFORE AND AFTER SELECTION
Reference [6] proposed the 22,510-dimensional CC-JRM image steganalysis feature, which is built from a sub-model system formed by a joint distribution of the frequency and spatial domain DCT coefficients covering a wide range of statistical correlations, with excellent performance under both tested algorithms and payloads.
In order to make CGSM method have a better selection effect, we use the steps in Section IV-B to discuss the difference function threshold m and the Pearson correlation coefficient threshold τ .
The feature dimensions selected by CGSM method with the difference function and the Pearson correlation coefficient, together with the corresponding detection accuracy, are shown in Figure 5.
In Figure 5, the horizontal axes represent the feature dimension, the vertical axes represent the corresponding detection accuracy, and the blue stars represent the optimal detection result of the CC-JRM steganalysis feature after selection by CGSM method. It can be clearly seen from the figure that CGSM method proposed in this paper can improve the detection accuracy of CC-JRM steganalysis feature for the stego images while greatly reducing the feature dimension, which indicates the effectiveness of the method. It can be seen from Figure 5 that when Payload = 0.1, the detection accuracy of the original 22,510-dimensional CC-JRM steganalysis feature is 0.5344 for the stego images, whereas the detection accuracy of the feature selected by CGSM method reaches 0.5360, which is 0.16% higher than the original detection accuracy. When Payload = 0. [...] At the same time, the selected feature dimension can be reduced to 44.78% of the original at different payloads, which reduces the cost of classifier training. When Payload = 0.5, the detection accuracy decreases slightly, which may be because the dimension of the selected feature is reduced to less than half of the original feature dimension. In previous work, reducing a feature to half of its original dimension led to an obvious drop in detection accuracy, so this result still indicates the validity of CGSM method.
Furthermore, we will compare the detection effect of CGSM method with Fisher-GFR method and Improved-Fisher method to demonstrate that the method proposed in this paper can reduce the feature dimension for low-dimensional and high-dimensional steganalysis feature while maintaining or even improving the detection accuracy of the stego images.

E. COMPARISON EXPERIMENT WITH FISHER-GFR METHOD
In order to make a fair comparison between the different methods, the experimental setup of the comparison experiment is the same as that in Section V-A. Reference [8] proposed a subspace selection algorithm for GFR steganalysis feature based on the Fisher criterion. Firstly, the weight of each steganalysis feature component is calculated based on the Fisher criterion value and the basic probability value of the steganalysis feature component. Then, the steganalysis feature components are selected in proportion to the weights and probability values. The method is able to select effective feature subspaces and thus improve the detection performance of GFR [5] steganalysis feature for the stego images. However, the method relies on the classifier results, which makes the temporal complexity of selection higher, and its detection performance for the stego images is not significantly improved when the quality factor is higher.
Since Fisher-GFR method only selects GFR steganalysis feature, we only compare the GFR steganalysis features selected by this method with those selected by the method proposed in this paper.
The experimental results of the optimal values of CGSM method and the values of Fisher-GFR method at different payloads are shown in Figure 6.
In Figure 6, the panel in the middle shows the comparative experiment of CGSM method and Fisher-GFR method at five different payloads, and the four surrounding panels show the comparative results of the two methods in more detail. The horizontal axis of each panel represents the payload, while the vertical axis represents the detection accuracy of the steganalysis features reduced by the two methods at different payloads. The left node at each payload presents the detection accuracy of the GFR steganalysis feature selected by Fisher-GFR method for the stego images, and the right node presents the detection accuracy of the GFR steganalysis feature selected by the method proposed in this paper (written as Proposed in the figure). From the figure, it can be clearly seen that the feature selected by CGSM method has a higher detection accuracy for the stego images. When Payload = 0.1, 0.2, 0.3, 0.4, and 0.5, the GFR steganalysis feature selected by CGSM method shows a higher detection accuracy than that selected by Fisher-GFR method, by 0.02%, 0.19%, 0.27%, 0.20%, and 0.28%, respectively. This indicates that the feature selected by CGSM method is superior to that selected by Fisher-GFR method in detection accuracy of the stego images.
In addition, we compared the selection time of the two methods. CGSM method requires much less selection time than Fisher-GFR method. For example, when Payload = 0.1, CGSM method takes only 0.53 hours, whereas Fisher-GFR method takes 97.89 hours; when Payload = 0.5, CGSM method takes only 1.08 hours, whereas Fisher-GFR method takes 125.6 hours. Therefore, CGSM method can significantly reduce the selection time of steganalysis feature.

F. COMPARISON EXPERIMENT WITH IMPROVED-FISHER METHOD
In order to make a fair comparison between the different methods, the experimental setup of the comparison experiment is the same as that in Section V-A. Reference [11] proposed a rich model feature optimization method based on the Improved-Fisher criterion. The method is based on the principle that 'the within-class variance should be less than the variance between two classes', and it evaluates the separability of feature components, sub-model features, and steganalysis feature vectors, respectively, utilizing the Improved-Fisher criterion. On this basis, two strategies for optimizing model features are proposed. The method can significantly reduce the feature dimension and the selection time, but it needs to be further improved in terms of detection accuracy.
For the three steganalysis features of CC-PEV, GFR and CC-JRM at different payloads, the results of CGSM method and Improved-Fisher method are presented below, and the comparative experimental results are shown in Table 7.
It can be seen from Table 7 that the detection accuracy of CC-PEV, GFR, and CC-JRM steganalysis features selected by CGSM method is generally higher than that of Improved-Fisher method. For example, when Payload = 0.1, 0.2, 0.3, 0.4, and 0.5, the detection accuracy of CC-PEV steganalysis feature selected by CGSM method is higher than that of Improved-Fisher method by 0.51%, 1.04%, 0.93%, 2.14%, and 1.67%, respectively. The detection accuracy of GFR steganalysis feature selected by CGSM method is higher by 0.65%, 0.25%, 1.95%, 1.69%, and 1.74%, respectively, and that of CC-JRM steganalysis feature is higher by 0.17%, 0.97%, 0.69%, 1.35%, and 2.30%, respectively.
In order to show the experimental results of CGSM method and Improved-Fisher method more clearly, we make Figure 7 according to Table 7.
In Figure 7, (a), (b), and (c) represent the detection accuracy of the three steganalysis features (CC-PEV, GFR, and CC-JRM) selected by CGSM method and Improved-Fisher method, respectively. It can be clearly seen that the detection accuracy of CGSM method for the stego images after selection is higher overall than that of Improved-Fisher method. Moreover, the higher the payload is, the larger the difference in detection accuracy between CGSM method and Improved-Fisher method is. Therefore, at higher payloads, the advantage of CGSM method over Improved-Fisher method in selecting the CC-PEV, GFR, and CC-JRM steganalysis features is more obvious.
Furthermore, we compared the selection time of the two methods. The selection time of CGSM method is significantly lower than that of Improved-Fisher method. For example, when Payload = 0.5, the selection time of CGSM method is 2.16 hours, whereas that of Improved-Fisher method is 16.5 hours.
In summary, the detection accuracy of CC-PEV, GFR, and CC-JRM steganalysis features selected by CGSM method for the stego images is better than that of Improved-Fisher method, and CGSM method significantly speeds up steganalysis feature selection.

VI. DISCUSSION
The purpose of this study is to reduce the steganalysis feature dimension while maintaining or even improving the detection accuracy. A series of experimental results indicate that the method proposed in this paper achieves the purpose of this study.
We also found that although CGSM method can reduce the feature dimension at all payloads, its gain in detection accuracy is slightly lower at high payloads than at low payloads, so CGSM method is more suitable for selecting features at low payloads.
Meanwhile, we replaced the Pearson correlation coefficient in CGSM method with other correlation coefficients, namely the Spearman correlation coefficient and the Kendall correlation coefficient, and conducted a series of experiments. Comparing the experimental results in terms of detection accuracy and feature dimension, we found that the feature selected with the Pearson correlation coefficient has a higher detection accuracy for the stego images than the features selected with the other two correlation coefficients. We analyzed this result and found that it may be because the Pearson correlation coefficient focuses on measuring the linear correlation between two variables, and the FLD integrated classifier is a linear discriminant classifier. Therefore, steganalysis feature components with a strong linear correlation under the Pearson measure suit this classifier better than those without linear correlation. However, this does not mean that the Pearson correlation coefficient is superior to other correlation analyses in every case. Different algorithms or classifiers may produce different results and conclusions.
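To make the difference between the three coefficients concrete, the following minimal sketch computes them for a toy pair of variables in plain Python. It uses the standard textbook definitions (Kendall's tau in its tau-a form, without tie correction), not the improved Pearson coefficient of CGSM method, and the data are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def ranks(v):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant pairs) / all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [1, 4, 9, 16, 25]   # monotone but nonlinear in x
    print(pearson(x, y), spearman(x, y), kendall_tau_a(x, y))
```

For this monotone but nonlinear relationship, the two rank-based coefficients equal 1 while the Pearson coefficient is slightly below 1, which illustrates why the Pearson measure specifically rewards linear dependence.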
Finally, it is worth noting that the detection results of the FLD integrated classifier vary between devices, but only very slightly. Because it is fast, easy to implement, and accurate in classification, it is increasingly popular among researchers.

VII. CONCLUSION
To reduce the feature dimension and improve the detection accuracy of the stego images by steganalysis, this paper proposes a steganalysis feature selection method based on the difference function and the Pearson correlation coefficient. Firstly, the value of each steganalysis feature component in the cover image class and the stego image class is converted to its weight in the sum of the values of that steganalysis feature component over the cover image class or the stego image class. Secondly, the difference function is used to measure the difference of each steganalysis feature component between the cover image class and the stego image class, the calculated difference function values are arranged in descending order, and the steganalysis feature components with larger difference function values are retained. Then, by setting the thresholds of the difference function and the Pearson correlation coefficient, the retained steganalysis feature components whose Pearson correlation coefficient is lower than the threshold are deleted. Finally, the steganalysis feature components with a correlation coefficient greater than or equal to the threshold are retained as the final selected steganalysis feature vector. This method can effectively reduce the feature dimension while maintaining or even improving the detection accuracy of the stego images, thereby reducing the space complexity of steganalysis. By comparing the temporal complexity of CGSM method with that of the selection method that depends on the results of the FLD integrated classifier, it is shown that the method in this paper can greatly improve the operating efficiency, thus reducing the temporal complexity of detecting the stego images with the classifier and reducing the detection cost.
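The two-stage selection summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as defined earlier in the paper: the function `difference` is a stand-in (normalized absolute difference of the class sums) for the paper's difference function, the plain Pearson coefficient replaces the improved one, and the names `select_features`, `m`, and `tau` as used here are hypothetical.

```python
from math import sqrt


def difference(cover_col, stego_col):
    """Stand-in difference measure (assumption): absolute difference of the
    class sums, normalized by the total magnitude over both classes."""
    total = sum(abs(v) for v in cover_col) + sum(abs(v) for v in stego_col)
    if total == 0:
        return 0.0
    return abs(sum(cover_col) - sum(stego_col)) / total


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / denom if denom else 0.0


def select_features(cover, stego, m, tau):
    """cover, stego: lists of feature vectors (one row per image).
    Returns the sorted indices of the retained feature components."""
    n_feat = len(cover[0])
    cols_c = [[row[j] for row in cover] for j in range(n_feat)]
    cols_s = [[row[j] for row in stego] for j in range(n_feat)]
    # Stage 1: sort components by difference value (descending), keep top m
    scores = [difference(cols_c[j], cols_s[j]) for j in range(n_feat)]
    top_m = sorted(range(n_feat), key=lambda j: scores[j], reverse=True)[:m]
    # Stage 2: delete components whose correlation with the class label
    # (0 = cover, 1 = stego) is below the threshold tau
    labels = [0.0] * len(cover) + [1.0] * len(stego)
    kept = [j for j in top_m
            if abs(pearson(cols_c[j] + cols_s[j], labels)) >= tau]
    return sorted(kept)


if __name__ == "__main__":
    # Toy data: component 0 separates the classes, 1 is constant, 2 is noise
    cover = [[1, 5, 2], [1, 5, 3], [1, 5, 2], [1, 5, 3]]
    stego = [[3, 5, 3], [3, 5, 2], [3, 5, 3], [3, 5, 2]]
    print(select_features(cover, stego, m=2, tau=0.5))   # [0]
```

Neither stage trains a classifier, which is consistent with the short selection times reported for CGSM method relative to classifier-dependent selection.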
A large number of experiments have been performed to show that the three steganalysis features selected by CGSM method, CC-PEV, GFR, and CC-JRM, can be used to train the classifier and detect the stego images, and that the detection accuracy of the stego images can be maintained or even improved while the feature dimension is reduced. In order to make a fair comparison between the different methods, the software, hardware, image source, and extracted features used in all experiments in this paper are the same. Compared with Fisher-GFR method and Improved-Fisher method, CGSM method has higher detection accuracy and lower space complexity. For example, when Payload = 0.3, the detection accuracy of the feature selected by CGSM method for the stego images is 0.27% higher than that of Fisher-GFR method. In the comparison experiment with Improved-Fisher method, when Payload = 0.4, the detection accuracy of CC-PEV, GFR, and CC-JRM steganalysis features selected by CGSM method for the stego images is 2.14%, 1.69%, and 1.35% higher than that of Improved-Fisher method, respectively. These results indicate that CGSM method outperforms Fisher-GFR and Improved-Fisher methods in detection accuracy of the stego images, and that it can reduce the feature dimension for both low-dimensional and high-dimensional steganalysis features while maintaining or even improving the detection accuracy of the stego images.
His research interests include data analysis, and network and information security.
NING RUAN was born in Henan, China, in 1988. He received the M.S. degree from Henan Normal University, Xinxiang, China, in 2016.