Fast and Automatic Image Segmentation Using Superpixel-Based Graph Clustering

Although the automatic fuzzy clustering framework (AFCF) based on improved density peak clustering achieves automatic and efficient image segmentation, it suffers from two problems. First, the adaptive morphological reconstruction (AMR) employed by the AFCF is easily influenced by the initial structuring element. Second, the improved density peak clustering, which uses a density balance strategy, is a complex way of finding potential clustering centers. To address these two problems, we propose a fast and automatic image segmentation algorithm using superpixel-based graph clustering (FAS-SGC). The proposed algorithm makes two major contributions. First, AMR based on regional minimum removal (AMR-RMR) is presented to improve the superpixel result generated by the AMR. Binary morphological reconstruction is performed on a regional minimum image, which avoids the empirical choice of the initial structuring element of the AMR because the geometrical information of images is effectively explored and utilized. Second, we use eigenvalue gradient clustering (EGC) instead of improved density peak (DP) algorithms to obtain potential clustering centers, since the EGC is faster and requires fewer parameters than the DP algorithm. Experiments show that the proposed algorithm achieves automatic image segmentation, providing better segmentation results while requiring less execution time than other state-of-the-art algorithms.


I. INTRODUCTION
Image segmentation has been widely used in computer vision [1], remote sensing image analysis [2], biomedical research [3], industrial detection [4], etc. Popular image segmentation algorithms can be organized into two groups: unsupervised and supervised image segmentation. The former often constructs feature descriptors and then chooses a suitable classifier to achieve image segmentation, and it does not require annotated labels [5]. In contrast, the latter does not require the construction of feature descriptors but requires a large number of labels to learn image features [6]. Therefore, the former is more flexible and can be used for different kinds of images. The latter often provides better segmentation results than the former when the test data follow a distribution similar to that of the training data, but it provides worse results when the test data are drawn from a different distribution [7]. It is clear that these two groups of segmentation algorithms have different advantages and disadvantages for different applications.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiaohui Yuan.
In unsupervised image segmentation, clustering is one of the most popular approaches since it simply and directly utilizes pixel classification to achieve image segmentation [8].
In this article, we focus on clustering-based image segmentation, which faces two challenges. The first challenge is that segmentation results rely on parameter tuning, and the second is that the execution time of algorithms becomes much longer for high-resolution images [9]. For instance, the number of clusters is an important parameter, and an inaccurate estimation of this parameter seriously affects the final segmentation result [10]. If the value of this parameter is too large, the results tend toward over-segmentation (segmentation results include a large number of small areas). On the other hand, when the value of this parameter is too small, the results tend toward under-segmentation (some important contour details are missed). Consequently, researchers often set an empirical value to balance over-segmentation and under-segmentation. Regarding the second challenge, clustering algorithms based on objective functions use iterative optimization to obtain the optimal clustering centers and fuzzy memberships. However, the iterative updating causes high computational complexity. In this work, we address these two challenges by proposing a fast and automatic clustering algorithm for image segmentation.
As clustering-based image segmentation ignores the spatial information of images, conventional k-means and fuzzy c-means clustering (FCM) are sensitive to noise. To address this issue, a popular idea is to replace each pixel with information from its neighborhood. Based on this idea, researchers proposed many improved clustering algorithms by employing a neighboring window of fixed size. Examples include the fuzzy local information c-means clustering algorithm (FLICM) [11], the neighborhood weighted FCM clustering algorithm (NWFCM) [12], FLICM based on kernel metric and weighted fuzzy factor (KWFLICM) [13], deviation-sparse fuzzy c-means with neighbor information constraint (DSFCM_N) [14], and similarity measure-based probabilistic FCM considering local label information [15]. These improved algorithms achieve better segmentation of noisy images since the spatial information of images is considered and utilized. However, incorporating spatial information into the objective function often leads to a high computational cost. To solve this problem, researchers utilized the image histogram instead of pixel sets to obtain fast FCMs such as enhanced FCM (EnFCM) [16], the fast generalized FCM algorithm (FGFCM) [17], the FCM algorithm based on noise detection (NDFCM) [18], fast and robust FCM (FRFCM) [19], etc. Because the number of gray levels in an image is much smaller than the number of pixels, these improved algorithms have a high computational efficiency. However, it is difficult to apply them to multi-channel image segmentation due to the difficulty of computing the histogram of multi-channel images.
As it is unreasonable to use a neighboring window with fixed shape and size to incorporate local spatial information, the aforementioned algorithms have a limited capability for improving the image segmentation effect [20]. To address this issue, researchers incorporated adaptive neighboring information into objective functions, as in Liu's algorithm [21], the superpixel-based clustering algorithm (SFFCM) [22], fuzzy double c-means clustering based on sparse self-representation (FDCM-SSR) [23], and sparse learning based FCM [24]. They employ different superpixel algorithms, e.g., simple linear iterative clustering (SLIC) [25], TurboPixel [26], and the watershed transform based on adaptive morphological reconstruction (AMR-WT) [27], to obtain pre-segmentation results. As a result, each pixel obtains a neighboring area with variable shape and size, which efficiently preserves the spatial structuring information of images and thus improves the segmentation effect for multi-channel images. Moreover, the SFFCM has a very low computational complexity. Superpixels are also popular for improving the computational efficiency of spectral clustering algorithms since they reduce the size of affinity matrices [28]-[30].
Superpixels address the second challenge of clustering-based image segmentation algorithms, the long execution time, and thus help to achieve fast image segmentation. For the first challenge of tuning clustering parameters, researchers have tried to estimate the potential number of clusters using different techniques, such as the density peak (DP) algorithm [31], genetic algorithms [32], particle swarm optimization [33], artificial bee colony optimization [34], density-ratio [35], etc. Amongst these, the DP algorithm proposed by Rodriguez and Laio is the most popular for data clustering due to its high computational efficiency and robustness. However, it only provides a decision graph without giving the number of clusters. To apply the DP algorithm to automatic image segmentation, Lei et al. [36] proposed an automatic fuzzy clustering framework (AFCF) that employs superpixels and a density balance algorithm to obtain better segmentation results automatically. More automatic clustering approaches can be found in [37]-[39].
Although automatic clustering algorithms are able to find potential clustering centers, most of them are complex and unsuitable for image segmentation. In this work, we propose a fast and automatic image segmentation algorithm employing superpixel-based graph clustering (FAS-SGC). The proposed algorithm has the following two advantages:
• FAS-SGC provides better segmentation because of the proposed improved AMR, where morphological reconstruction on the regional minimum image is employed to remove the initial structuring element parameter required by the AMR.
• FAS-SGC provides fast image segmentation because we employ eigenvalue clustering instead of density peak clustering to find potential clustering centers, as the former has a lower computational complexity than the latter.
The rest of this article is organized as follows. In Section II, we present the motivation of this work. In Section III, we propose our methodology and analyze its superiority over popular algorithms. The experimental results on real images and some high-resolution images are described in Section IV. Finally, we present our conclusion in Section V.

II. MOTIVATION
Although clustering is popular for image segmentation and a large number of improved clustering algorithms have been proposed in recent years, computational efficiency and parameter tuning remain two difficulties for existing clustering algorithms. In this section, we present our motivation for achieving fast and automatic image segmentation.
VOLUME 8, 2020

A. IMAGE SUPERPIXEL USING AMR
In our previous work [27], we proposed a novel adaptive morphological reconstruction algorithm that is useful for seeded image segmentation. The AMR has three advantages: (1) The AMR employs multi-scale structuring elements to generate a new gradient image to improve gradient-based image segmentation; (2) The AMR is able to achieve hierarchical segmentation due to its monotone-increasing property and convergence property; and (3) the AMR has a high computational efficiency since it only depends on the gradient information of images. AMR requires two parameters, s and η, where the parameter s is the size of the initial structuring element and η is the minimal error. Generally, η is a constant and it is set to 0.001. In [27], we demonstrated that the final segmentation result is insensitive to η, but the result is sensitive to s when the value of s is large. Fig. 1 shows the comparison of segmentation using the AMR-WT with different values of s.
The original image is shown in Fig. 1(a). It can be seen that the boundaries in Fig. 1(b) are more accurate than those in Fig. 1(c) and Fig. 1(d), but there are more small useless segmentation areas in Fig. 1(b) than in Fig. 1(c) and Fig. 1(d). Fig. 1 shows that boundaries are more accurate when s is smaller while the coverage is better when s is larger. Figs. 1(e)-(f) show the comparison between s = 1 and s = 5. It is clear that the boundary accuracy decreases as s increases. Thus, the AMR-WT usually finds a balance between boundary accuracy and coverage by setting an appropriate s.
A large value of s clearly removes small areas in segmentation results while decreasing the boundary accuracy. How can we obtain a segmentation result that achieves higher boundary accuracy and fewer segmentation areas? In practice, region merging can remove small areas while preserving boundary accuracy, so it is a good candidate for addressing the problem. However, region merging needs to compute the features of areas and to update the merging result iteratively, which is disadvantageous for achieving fast image segmentation.
It is well known that the watershed transform often suffers from over-segmentation [40] since a gradient image is sensitive to noise. A gradient image often includes a large number of regional minima, leading to over-segmentation as shown in Fig. 2. There are two ways to reduce over-segmentation. One is gradient image optimization, as shown in Fig. 3. The other is seed image optimization, as shown in Fig. 4. The AMR is an excellent algorithm for gradient image optimization. To improve the AMR-WT further, regional minimum removal on the AMR result is considered in this article.

B. DP ALGORITHM FOR AUTOMATIC CLUSTERING
The DP algorithm [31] is often used in automatic clustering since it can find potential clustering centers. In this algorithm, there are two crucial quantities, namely, the local density $\rho_i$ and the minimal distance $\delta_i$. The quantity $\rho_i$ describes the local density of a sample $x_i$ and is given by (1), as defined in [31]. The quantity $\delta_i$ indicates the minimal distance between the sample $x_i$ and any other sample with higher density, and it is defined as

$$\delta_i = \min_{j:\,\rho_j > \rho_i} d_{ij}. \tag{2}$$

Here, N is the total number of samples in a data set and $1 \le i, j \le N$. The quantity $d_{ij}$ denotes the Euclidean distance between $x_i$ and $x_j$, and $d_c$ denotes the cutoff distance [31]. According to (1)-(2), the i-th sample is considered as a clustering center when both $\rho_i$ and $\delta_i$ are large. Therefore, constructing a decision graph with $\rho_i$ on the horizontal axis and $\delta_i$ on the vertical axis helps to recognize hidden cluster centers.
To choose potential clustering centers easily, the DP algorithm considers $\gamma_i = \rho_i \delta_i$ sorted in decreasing order as the final decision graph. Fig. 5 shows the basic idea of the DP algorithm.
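The decision values can be illustrated with a minimal sketch. It assumes the cutoff-kernel density of the original DP paper [31] (the number of samples closer than $d_c$), which may differ from the density actually used in (1); the function name `density_peak_scores` and the toy points are hypothetical.

```python
import math

def density_peak_scores(points, d_c):
    """Return (rho, delta, gamma) for a list of coordinate tuples."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Assumed density: number of other samples closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # Distances to all samples with strictly higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # Convention of the DP paper: the densest sample takes its largest distance.
        delta.append(min(higher) if higher else max(dist[i]))
    gamma = [r * d for r, d in zip(rho, delta)]
    return rho, delta, gamma

# Two well-separated groups; the point (0.5, 0.5) sits in the denser one.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11), (0.5, 0.5)]
rho, delta, gamma = density_peak_scores(points, d_c=1.2)
ranking = sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)
```

Sorting by `gamma` gives the decision graph ordering; choosing how many of the top-ranked samples are centers is exactly the step that the DP algorithm leaves to the user.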
In order to apply the DP algorithm to automatic image segmentation, Lei et al. [36] employ superpixels and a density balance algorithm to improve the DP algorithm. In this algorithm, the authors use superpixels to overcome the problem of memory overflow for large-scale images. Furthermore, this algorithm uses a density balance strategy to improve the accuracy of the clustering parameter estimation.
In [36], a mapping from $\gamma_i$ to $\varphi_i$ is presented, where $\xi(\chi_r) = \sum_{j=1}^{n} \phi_j$ is the total number of $\gamma_i$ under the condition $\chi_r - \gamma_i \le \eta$, with $\phi_j$ defined as in [36] and $1 \le r \le Z + 1$. The parameter Z is a constant and it is often set to 1000. Here, $\chi_r = r/Z$ and η is the minimal stopping error.
We demonstrated that the density balance algorithm is superior to $\gamma_i$ for finding potential clustering centers. However, we need to compute five variables ($\rho_i$, $\delta_i$, $\gamma_i$, $\varphi_i$, $\psi_i$) and the maximum interval to obtain the number of clustering centers, which increases the computational complexity of the AFCF. In this article, we address this issue by employing graph clustering. Fig. 6 shows that the AFCF can achieve automatic image segmentation. The segmentation result depends on the superpixel algorithm and the improved DP algorithm. However, the computation of decision graphs is complex since five variables are required. Both the DP algorithm and spectral clustering depend on affinity matrices of samples: the former seeks density peaks and the latter employs eigenvalue decomposition to achieve data clustering. Motivated by this, we use graph clustering instead of the DP algorithm to estimate the number of clustering centers, which avoids the computation of the five variables. The idea is simpler and more efficient than the DP algorithm. The detailed analysis is presented in Section III.B.

III. METHODOLOGY
In Section II, we presented our motivations of this work. Here, we employ the improved AMR to generate better superpixel images with higher boundary accuracy, and use the graph clustering instead of the DP algorithm to achieve faster estimation of clustering parameters.

A. REGIONAL MINIMUM REMOVAL USING MR
To remove useless regional minima that often cause over-segmentation, we proposed the AMR in [27]. The AMR can remove useless regional minima to improve the final segmentation result. The AMR, denoted by ψ, is defined as

$$\psi(g, s, m) = \bigvee_{s \le i \le m} \left\{ R^{\varphi}_{g}(f)_{b_i} \right\},$$

where g is a mask image and f is a marker image, $b_s \subseteq b_{s+1} \subseteq \cdots \subseteq b_m$ are a series of nested structuring elements, the parameter i denotes the scale of a structuring element, $\bigvee$ stands for the pointwise maximum, and $R^{\varphi}$ denotes morphological closing reconstruction. In practical image segmentation, the marker image is often defined as $f = \varepsilon_{b_i}(g)$, where ε represents the elementary morphological erosion operation and $f \le g$. It can be seen that the initial structuring element parameter s of the AMR is chosen empirically. If the value of s is large enough, the AMR degrades, and it becomes equal to MR when the initial structuring element equals the maximal one, i.e., s = m. Figs. 7 and 8 show the comparison of segmentation results using different values of s. To overcome this issue, we remove the parameter s of the AMR and thus present a simpler representation, i.e.,

$$\psi(g, m) = \bigvee_{1 \le i \le m} \left\{ R^{\varphi}_{g}(f)_{b_i} \right\}.$$

The new representation can obtain segmentation results with higher boundary accuracy, but the segmentation result includes more small areas, as shown in Fig. 7.
We can see that the regional minimum image includes many small connected components in Fig. 7(d). In practice, a connected component corresponds to a segmentation area. Fig. 7(e) shows connected components marked by different colors. It is obvious that we can remove small connected components to achieve region merging. We firstly present a theorem before addressing this issue.

Theorem 1:
Let g be a gradient image, I be the regional minimum image of g, and W be the final segmentation result obtained using the watershed transform, with $I = (I_1, I_2, \cdots, I_n)$ and $W = W_1 \cup W_2 \cup \cdots \cup W_n$, where $W_{j_1} \cap W_{j_2} = \emptyset$, $1 \le j_1, j_2 \le n$, $j_1 \ne j_2$, $x_p$ is the p-th pixel in W, and $x_q$ is the q-th pixel in the image I. Then each connected component $I_j$ of I corresponds to one segmentation area $W_j$.

Theorem 1 shows that small segmentation areas can be merged by removing smaller connected components in the image I. Fortunately, the regional minimum image is a binary image. For this type of image, geometrical shape information is more useful than grayscale information. This is the reason why morphological operators are more popular for binary images than for grayscale or color images. Based on binary morphological operations and Theorem 1, we can use binary morphological reconstruction to remove smaller connected components and thus achieve region merging on segmentation results. The proposed algorithm is named the watershed transform based on AMR and regional minimum removal (AMR-RMR-WT). We remove connected components using the formula in (9), where k denotes the parameter of the structuring elements. According to (9), it is easy to merge small regions in the image W by setting the value of the parameter k. Moreover, the larger the value of k, the more small regions are merged.
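The removal of small connected components from a binary regional minimum image can be sketched as follows. This is an illustration under stated assumptions, not the exact operator of (9): it uses binary opening by reconstruction with a square structuring element of half-width k and 8-connectivity, and the names `erode` and `remove_small_minima` are hypothetical.

```python
from collections import deque

def erode(binary, k):
    """Binary erosion with a (2k+1) x (2k+1) square; outside pixels count as 0."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if all(0 <= r + dr < h and 0 <= c + dc < w and binary[r + dr][c + dc]
                   for dr in range(-k, k + 1) for dc in range(-k, k + 1)):
                out[r][c] = 1
    return out

def remove_small_minima(binary, k):
    """Keep only the 8-connected components of `binary` that survive erosion."""
    h, w = len(binary), len(binary[0])
    marker = erode(binary, k)
    seeds = [(r, c) for r in range(h) for c in range(w) if marker[r][c]]
    keep = [[0] * w for _ in range(h)]
    for r, c in seeds:
        keep[r][c] = 1
    queue = deque(seeds)
    # Grow the marker inside the original image (reconstruction by dilation).
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < h and 0 <= nc < w
                        and binary[nr][nc] and not keep[nr][nc]):
                    keep[nr][nc] = 1
                    queue.append((nr, nc))
    return keep

# A 3 x 3 regional minimum and an isolated one-pixel minimum.
minima = [[0, 0, 0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0, 0, 0],
          [0, 1, 1, 1, 0, 1, 0],
          [0, 1, 1, 1, 0, 0, 0],
          [0, 0, 0, 0, 0, 0, 0]]
cleaned = remove_small_minima(minima, k=1)
```

In this toy case the isolated pixel is removed while the larger component keeps its exact shape, which reflects why reconstruction merges small regions without moving the remaining boundaries.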
In the AMR, two stopping conditions are employed to speed up the execution of the algorithm, and three parameters s, m and η are required. In the AMR-RMR-WT, the parameter s is removed but k is introduced as a new parameter. Nevertheless, the AMR-RMR-WT is superior to the AMR-WT due to the improvement in boundary accuracy. Algorithm 1 gives the detailed steps of the AMR-RMR-WT.
Algorithm 1 AMR-RMR-WT
    ...
        end if
        if J ≤ η then
            break
        end if
    end for
    I = regionalMin(ψ)
    ...
    Compute watershed line to obtain W

We applied Algorithm 1 to Fig. 7(a). Fig. 9 shows the comparison of segmentation results. Table 1 shows the number of segmentation areas provided by different algorithms.
By comparing Fig. 8(d) and Fig. 9(c) (bottom), we can see that the AMR-RMR-WT provides a better region merging effect than the AMR-WT. Although the AMR-WT (s = 3) yields a number of areas similar to that of the AMR-RMR-WT (k = 5), as shown in Table 1, the latter generates a better segmentation result with higher boundary accuracy. Fig. 9 and Table 1 further demonstrate the advantages of the AMR-RMR-WT.

B. AUTOMATIC GRAPH CLUSTERING
In Section III.A, we described the principle of AMR-RMR-WT and its advantages. To further improve segmentation, we study automatic graph clustering based on superpixels provided by the AMR-RMR-WT in this Section.
Although many improved spectral clustering algorithms have been proposed [41]-[44], few of them focus on automatic spectral clustering. Some researchers employ the maximum intervals of eigenvalues [38] to estimate the potential clustering centers. However, this approach often fails, as shown in Fig. 10. Here, we introduce eigenvalue gradient clustering (EGC) to improve the prediction accuracy of the potential number of clusters.
Firstly, we analyze the eigenvalues of spectral clustering. As we employ the AMR-RMR-WT to generate superpixel results, the corresponding data set is defined as $V = \{v_1, v_2, \cdots, v_n\}$, with

$$v_j = \frac{1}{|\partial_j|} \sum_{x_p \in \partial_j} x_p,$$

where $\partial_j$ denotes the j-th region in a superpixel image, $|\partial_j|$ is the number of pixels in that region, and $v_j$ is the average gray-scale value of pixels in $\partial_j$.
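As a small illustration of how the data set V can be built, the sketch below averages gray values per superpixel. It assumes a gray-scale image and a label map stored as nested lists of equal shape; the function name `superpixel_means` and the toy data are hypothetical.

```python
def superpixel_means(image, labels):
    """Map each superpixel label to the mean gray value of its pixels."""
    sums, counts = {}, {}
    for image_row, label_row in zip(image, labels):
        for value, label in zip(image_row, label_row):
            sums[label] = sums.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

# Toy 2 x 2 image with two superpixels (labels 0 and 1).
image = [[10, 12],
         [200, 204]]
labels = [[0, 0],
          [1, 1]]
V = superpixel_means(image, labels)
```

Each superpixel then contributes a single sample, so the number of clustering samples drops from the number of pixels to the number of regions n.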
According to V, we can get the affinity matrix $A \in \mathbb{R}^{n \times n}$, whose entries $A_{ji}$ measure the similarity between $v_j$ and $v_i$, where $\sigma^2$ is the scaling parameter of A. Furthermore, we can compute the degree matrix, denoted by D, and then the Laplacian matrix L. The eigenvalue set of L is $\lambda = \{\lambda_1, \lambda_2, \cdots, \lambda_n\}$, with $\lambda_1 = 1$ and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. Generally, the first c eigenvalues and their corresponding eigenvectors are used for k-means clustering to obtain the final clustering result. However, it is difficult to set the value of c. By analyzing the eigenvalue distribution in Fig. 10, we can see that most of the eigenvalues are small and only a few are large, which indicates that there is a large amount of redundancy in an image. The problem is how to remove redundant eigenvalues while preserving useful ones. Here, we use the idea of clustering to replace the maximal eigenvalue interval. Assume that the eigenvalues of an affinity matrix can be grouped into three groups, where the first group is redundant and useless due to very small eigenvalues, the second group may be important and useful for classification since it has clearly larger values than the first group, and the last group is similar to the second group but has higher values. However, performing clustering on eigenvalue sets takes a long execution time due to the many iterations required. To decrease the number of iterations and improve the clustering accuracy, we perform clustering on eigenvalue gradient sets. As eigenvalue gradients reduce the number of different values in λ, it is easier to implement clustering on eigenvalue gradient sets than on eigenvalue sets.
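The construction of the affinity, degree and normalized matrices can be sketched as below. The Gaussian kernel with denominator $2\sigma^2$ and the symmetric normalization $D^{-1/2} A D^{-1/2}$ are assumptions chosen because they yield a largest eigenvalue of 1, consistent with $\lambda_1 = 1$; the exact definitions in the source equations may differ, and all function names are hypothetical.

```python
import math

def gaussian_affinity(values, sigma2):
    """Assumed affinity: A[j][i] = exp(-(v_j - v_i)^2 / (2 * sigma2))."""
    return [[math.exp(-((vj - vi) ** 2) / (2.0 * sigma2)) for vi in values]
            for vj in values]

def normalized_affinity(A):
    """Return (L, degrees) with L = D^(-1/2) A D^(-1/2) (assumed normalization)."""
    degrees = [sum(row) for row in A]
    inv_sqrt = [1.0 / math.sqrt(d) for d in degrees]
    n = len(A)
    L = [[inv_sqrt[j] * A[j][i] * inv_sqrt[i] for i in range(n)]
         for j in range(n)]
    return L, degrees

# Mean gray values of four superpixels: two dark regions and two bright ones.
values = [10.0, 12.0, 200.0, 205.0]
A = gaussian_affinity(values, sigma2=100.0)
L, degrees = normalized_affinity(A)

# The vector sqrt(degrees) is an eigenvector of L with eigenvalue 1.
vec = [math.sqrt(d) for d in degrees]
L_vec = [sum(L[j][i] * vec[i] for i in range(len(vec))) for j in range(len(vec))]
```

Because the sample count equals the number of superpixels, this n × n matrix stays small even for high-resolution images.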
We perform FCM on $\lambda_g$; the objective function is

$$J = \sum_{k=1}^{c} \sum_{j=1}^{n-2} u_{kj}^{m} \left\| \lambda^{g}_{j} - y_k \right\|^2,$$

where $\lambda^{g}_{j}$ is the j-th element of $\lambda_g$, $y_k$ represents the prototype value of the k-th cluster, and $u_{kj}$ denotes the membership value of the j-th sample with respect to cluster k. $U = [u_{kj}]_{c \times (n-2)}$ represents the membership partition matrix. The parameter c is the number of clusters. The parameter m is a weighting exponent on each fuzzy membership that determines the amount of fuzziness of the classification results. Fig. 11 and Table 2 compare eigenvalue clustering with the EGC. Fig. 11(b) shows a better clustering result than Fig. 11(a), which demonstrates that the EGC is superior to eigenvalue clustering for finding potential clustering centers. In Table 2, eigenvalue clustering requires more iterations than the EGC, and the former obtains a larger inter-class variance than the latter, which means that the latter provides more accurate classification results and requires fewer iterations.

Algorithm 2 EGC
    ...
        if max{U^(t) − U^(t−1)} < η then
            break
        end if
        ...
    end for
    Sort y_k in descending order, y_1 ≥ y_2 ≥ y_3
    Count the number of samples that belong to the first two classes y_1 and y_2
    Output c = C(y_1) + C(y_2)

In Algorithm 2, $y_k$ denotes the k-th clustering center, $k \le 3$, and $C(y_1)$ and $C(y_2)$ denote the number of elements classified into $y_1$ and $y_2$, respectively. We use Algorithm 2 to compute new decision graphs and segmentation results, as shown in Fig. 12. Comparing Fig. 10 and Fig. 12, it is clear that the EGC algorithm can provide an accurate number of clusters.
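The counting rule of the EGC can be sketched as follows. Several details are assumptions rather than the authors' implementation: the eigenvalue gradient is taken as the difference between consecutive eigenvalues after dropping the trivial $\lambda_1 = 1$ (which gives n − 2 samples, matching the size of U), the FCM prototypes are initialized at the minimum, midpoint and maximum of the gradients, and a small constant avoids division by zero. The names `fcm_1d` and `estimate_cluster_number` are hypothetical.

```python
def fcm_1d(data, centers, m=2.0, eta=1e-5, max_iter=50):
    """Plain FCM on scalars; returns (centers, u) with u[k][j] memberships."""
    centers = list(centers)
    c, n = len(centers), len(data)
    prev, u = None, []
    for _ in range(max_iter):
        # Membership update; 1e-12 guards against zero distances.
        inv = [[((x - y) ** 2 + 1e-12) ** (-1.0 / (m - 1.0)) for x in data]
               for y in centers]
        totals = [sum(inv[k][j] for k in range(c)) for j in range(n)]
        u = [[inv[k][j] / totals[j] for j in range(n)] for k in range(c)]
        # Prototype update.
        centers = [sum((u[k][j] ** m) * data[j] for j in range(n))
                   / sum(u[k][j] ** m for j in range(n)) for k in range(c)]
        if prev is not None and max(abs(a - b) for row, old in zip(u, prev)
                                    for a, b in zip(row, old)) < eta:
            break
        prev = u
    return centers, u

def estimate_cluster_number(eigenvalues):
    """Count gradients assigned to the two largest of three FCM prototypes."""
    lam = sorted(eigenvalues, reverse=True)[1:]   # drop lambda_1 = 1 (assumption)
    grad = [lam[j] - lam[j + 1] for j in range(len(lam) - 1)]
    low, high = min(grad), max(grad)
    centers, u = fcm_1d(grad, [low, (low + high) / 2.0, high])
    order = sorted(range(3), key=lambda k: centers[k], reverse=True)
    labels = [max(range(3), key=lambda k: u[k][j]) for j in range(len(grad))]
    return sum(1 for label in labels if label in order[:2])

# Toy spectrum: three large drops followed by a flat tail of small eigenvalues.
eigenvalues = [1.0, 0.9, 0.6, 0.35, 0.05, 0.04, 0.03, 0.02, 0.01, 0.0]
c_estimate = estimate_cluster_number(eigenvalues)
```

On this toy spectrum the three large gradients fall into the two upper classes and the flat tail forms the third class, so the sketch returns 3.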

C. AUTOMATIC IMAGE SEGMENTATION FRAMEWORK
In Section III.A, we presented the AMR-RMR-WT to remove the parameter s and improve the boundary accuracy, which is useful for achieving fast image segmentation. In Section III.B, we presented the automatic graph clustering algorithm based on the EGC, which is helpful for improving automatic image segmentation effect. Based on Sections III.A and III.B, we propose the overall image segmentation framework in this Section. The framework includes three stages, i.e., image superpixel, the parameter estimation of clustering, and spectral clustering as shown in Fig. 13.
Note that there are two superpixel images in Fig. 13; the first one is used for parameter estimation and the second one is used for pre-segmentation. Although there are two superpixel images, we only compute the AMR-RMR-WT once because the AMR-WT can provide hierarchical segmentation results.
According to Fig. 13, the proposed FAS-SGC has the following advantages:
• The FAS-SGC is a fast algorithm for two reasons. The first is that the number of clustering samples is small because we use superpixels instead of the original image. Superpixels can simplify an image while maintaining its spatial structuring information. The second is that the EGC algorithm is fast for generating decision graphs since the computational complexity of the EGC is low.
• The FAS-SGC is an automatic algorithm for image segmentation. It requires fewer parameters. In the first stage, there is no parameter except iteration stopping conditions. In the second stage, we only use basic parameters of FCM. Moreover, the number of clusters is a constant 3 for the EGC. The last stage only uses the parameter c provided by the EGC.
Based on the above analysis, we propose the detailed steps of FAS-SGC.
Step 2: Compute two affinity matrices $A^1_{ji}$ and $A^2_{ji}$ with respect to the two superpixel images according to (10)-(12).
Step 3: Apply Algorithm 2 (EGC) to $A^1_{ji}$ to obtain the number of clusters c.
Step 5: Perform eigenvalue decomposition on L to obtain eigenvectors.
Step 6: Perform k-means on top c eigenvectors.
Step 7: Reshape labels and output the final segmentation result.
We perform the FAS-SGC on different images and Fig. 12(c) shows final segmentation results. In contrast to Fig. 10, the FAS-SGC obtains better segmentation results. Note that the segmentation process is fully automatic without human involvement.

IV. EXPERIMENTS
In Section III, we described the FAS-SGC in detail. To demonstrate the efficiency and effectiveness of the FAS-SGC, we conducted experiments on a synthetic image, the popular BSDS500 benchmark images [45], and a high-resolution scanning electron microscopy (SEM) image. We chose this benchmark, which includes 500 images of size 321 × 481 or 481 × 321, to demonstrate that the proposed method is effective for many different images. Note that the BSDS500 provides standard ground truth segmentations, which are convenient for evaluating algorithm performance. Additionally, we chose a high-resolution image from a special application since high-resolution images are becoming more and more common with the development of imaging technology.

A. PARAMETER SETTING
In our experiments, parameters are set following the original articles. For a fair comparison, a window of size 3 × 3 is employed by algorithms that require a neighboring window of fixed size, such as HMRF-FCM, FLICM, KWFLICM and FRFCM. FRFCM requires two filters; both the structuring element and the filtering window are squares of size 3 × 3 [19]. The three parameters related to Mean-shift, namely, spatial bandwidth $h_s = 10$, range bandwidth $h_r = 10$, and minimum region area $h_k = 100$, are used for Liu's algorithm [21], whereas $h_s = 7$, $h_r = 7$, and $h_k = 30$ are used for FNCut [28], following the respective original articles. In SFFCM and AFCF, AMR-WT is employed to obtain the superpixel image, where the radius of the initial structuring element is 3 and the minimal threshold error of AMR is $10^{-4}$. Except for the three indispensable parameters (the weighting exponent, the minimal threshold error, and the maximal number of iterations) and the number of cluster prototypes, the HMRF-FCM, FLICM and KWFLICM do not require any other parameters.
Because some comparative algorithms are time-consuming, the three indispensable parameters, namely the weighting exponent, the minimal threshold error, and the maximal number of iterations, are set to 2, $10^{-5}$, and 50, respectively. For the proposed FAS-SGC, only two parameters are required: the radius of the structuring element used for RMR, which is set to 3, and the maximal structuring element, which follows the AMR in [27]. The EGC adopts default parameters, where the number of clusters is a constant 3. All these parameter settings are used in the following experiments.

B. RESULTS ON SYNTHETIC IMAGES
To demonstrate the efficacy of the proposed FAS-SGC for noisy image segmentation, a synthetic image corrupted by mixed noise is considered as the test image. Fig. 14 shows the comparative segmentation results using different algorithms. Note that both AFCF and FAS-SGC are automatic image segmentation algorithms that can accurately estimate the number of clusters, which is then used for all comparative algorithms.
As can be seen, Figs. 14(c, d, e and h) obtain accurate clustering centers but contain a large number of wrongly classified pixels, which demonstrates that HMRF-FCM, FLICM, and DSFCM_N have a limited capability for noise suppression due to the selection of fixed and small-size neighboring windows. Segmentation results in Figs. 14(g and i) obtain erroneous clustering centers and are completely wrong, which shows that KWFLICM, FRFCM, and FNCut are sensitive to mixed noise when they are used for color image segmentation. Figs. 14(f and j) show good area characteristics due to the employment of superpixel algorithms. Figs. 14(k and l) show that AFCF and FAS-SGC obtain similar and good segmentation results that are close to expectations, but the latter provides more accurate details than the former since the AMR-RMR-WT is superior to the AMR-WT for superpixel images.
To evaluate the performance of different algorithms, we adopt four popular performance metrics: The Probabilistic Rand Index (PRI), the Covering (CV), the Variation of Information (VI), and the Global Consistency Error (GCE). Generally, a good segmentation corresponds to high values of PRI and CV, and corresponds to low values of VI and GCE. Table 3 shows the performance comparison on Fig. 14 according to four metrics PRI, CV, VI and GCE.
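Two of these metrics can be illustrated with a minimal sketch on flattened label lists. It assumes the commonly used formulation in which the PRI is the mean Rand index over the available ground truths and the VI is computed with natural logarithms; the function names are hypothetical and the evaluation code used for the reported tables may differ.

```python
import math
from collections import Counter

def rand_index(a, b):
    """Fraction of pixel pairs on which two labelings agree."""
    n = len(a)
    pairs = n * (n - 1) // 2
    same_both = sum(v * (v - 1) // 2 for v in Counter(zip(a, b)).values())
    same_a = sum(v * (v - 1) // 2 for v in Counter(a).values())
    same_b = sum(v * (v - 1) // 2 for v in Counter(b).values())
    # Agreements: pairs joined in both, plus pairs separated in both.
    agree = same_both + (pairs - same_a - same_b + same_both)
    return agree / pairs

def probabilistic_rand_index(segmentation, ground_truths):
    """Mean Rand index against several ground-truth labelings."""
    return sum(rand_index(segmentation, g) for g in ground_truths) / len(ground_truths)

def variation_of_information(a, b):
    """VI = H(a) + H(b) - 2 I(a; b), in nats."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    h_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    h_b = -sum(c / n * math.log(c / n) for c in count_b.values())
    mutual = sum(c / n * math.log((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
                 for (x, y), c in count_ab.items())
    return h_a + h_b - 2.0 * mutual

seg = [0, 0, 1, 1]
same_partition = [1, 1, 0, 0]       # identical regions, different label names
crossed_partition = [0, 1, 0, 1]    # regions that cut across those of seg
```

Both measures depend only on the partition and not on the label names, so `seg` and `same_partition` give a Rand index of 1 and a VI of 0.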
In Table 3, most of these algorithms obtain low values of PRI and CV and high values of VI and GCE, except AFCF and FAS-SGC. These two algorithms are state-of-the-art for noisy image segmentation. Among these algorithms, the proposed FAS-SGC obtains the largest values of PRI and CV and the smallest values of VI and GCE. The quantitative performance of the different algorithms is consistent with the visual results in Fig. 14. Clearly, FAS-SGC is demonstrated to be insensitive to noise in image segmentation.

C. RESULTS ON BENCHMARK
The BSDS500 is popular for evaluating image segmentation algorithms since there are 4-9 ground truth segmentations for each image and each ground truth segmentation is delineated by one human subject. We performed comparative algorithms and the proposed FAS-SGC on the BSDS500. Note that both the AFCF and FAS-SGC are automatic and thus the number of clusters is unrequired for them. To compare different algorithms fairly, we firstly perform FAS-SGC on BSDS to obtain c that is used for all comparative algorithms except AFCF. Fig. 15 shows some segmentation results.
In Fig. 15, HMRF-FCM, FLICM, KEWFLICM, FRFCM, Liu's method, and DSFCM_N fail to segment image ''3063''. Algorithms FNCut, SFFCM, AFCF, and FAS-SGC obtain better segmentation results but FNCut misses boundary details due to the Mean-shift algorithm involves image filtering. We also see that though both SFFCM and AFCF generate better segmentation results than other comparative algorithms on image ''3063'', these results are worse than the result provided by the FAS-SGC. It is clear that the FAS-SGC obtains better boundary details than the SFFCM since the AMR-RMR-WT is superior to the AMR-WT, and the former obtains more accurate result than the AFCF since the EGC is superior to the DP algorithm. Similarly, the FAS-SGC generates better segmentation results for other test images. Even though the image ''134008'' has a very low contrast between object and background, the FAS-SGC obtains excellent segmentation result.
In Table 4, we can see that HMRF-FCM, FLICM, KWFLICM, and FRFCM obtain similar PRI and CV. These algorithms obtain low values of PRI and CV since they use small neighboring windows to integrate spatial information. The DSFCM_N obtains a worse result since it employs sparse representation, which is only effective for images corrupted by noise. Liu's algorithm, FNCut, and SFFCM obtain similar performance, which is higher than that of HMRF-FCM, FLICM, KWFLICM, and FRFCM due to the use of superpixel algorithms. The proposed FAS-SGC obtains the best performance among all tested algorithms, which demonstrates that the FAS-SGC is able to provide good segmentation results for real images. In addition, the FAS-SGC is a fast algorithm for image segmentation, as illustrated in Section IV.E.
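Because each BSDS500 image has several ground truth segmentations, a per-image score has to aggregate over them. The sketch below assumes the common convention in which the PRI of a segmentation equals the mean Rand index over the available ground truths; the function names and the flat-list input format are illustrative assumptions rather than the benchmark code used here.

```python
from collections import Counter


def _pairs(k):
    """Number of unordered pixel pairs among k pixels."""
    return k * (k - 1) / 2.0


def rand_index(seg, gt):
    """Fraction of pixel pairs on which two labelings agree.

    A pair agrees when both labelings put the two pixels in the same region,
    or both put them in different regions. Inputs are flat, equal-length
    label sequences with at least two pixels.
    """
    if len(seg) != len(gt):
        raise ValueError("label maps must have the same number of pixels")
    total = _pairs(len(seg))
    same_both = sum(_pairs(c) for c in Counter(zip(seg, gt)).values())
    same_seg = sum(_pairs(c) for c in Counter(seg).values())
    same_gt = sum(_pairs(c) for c in Counter(gt).values())
    # Agreements = pairs joined in both + pairs separated in both.
    return (total + 2.0 * same_both - same_seg - same_gt) / total


def probabilistic_rand_index(seg, ground_truths):
    """Mean Rand index of one segmentation against several ground truths."""
    return sum(rand_index(seg, gt) for gt in ground_truths) / len(ground_truths)


if __name__ == "__main__":
    segmentation = [0, 0, 1, 1]
    annotations = [[5, 5, 7, 7], [0, 1, 0, 1]]  # two toy "human" labelings
    print(probabilistic_rand_index(segmentation, annotations))
```

Label values themselves do not matter, only which pixels share a label, so the cluster indexes produced by different algorithms can be compared directly without relabeling.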

D. RESULTS ON HIGH-RESOLUTION SEM IMAGE
To show that the proposed FAS-SGC is useful for some special images, we apply it to a SEM image with a very high resolution of 1278 × 892. SEM is an imaging device that generates a topological image of samples using a beam of electrons to achieve much higher spatial resolution than an optical microscope [47]. The device is able to capture the surface morphology of samples and thus it is widely used in scientific research fields such as medicine, biology, materials science, chemistry, and physics [48], [49]. Generally, SEM provides magnifications ranging from about 15 to 50000 times. Here, a SEM image of porous material is used as the test image, as shown in Fig. 16, where dark areas denote holes and brighter areas denote connections. Researchers want to know the size and distribution of holes to analyze the physical and mechanical properties of porous material. Traditionally, they first select one or two holes and then compute the size of the holes manually. This manual approach is imprecise and, because it samples so few holes, lacks statistical significance. We use image segmentation to obtain accurate data on the hole distribution. We ran the comparative algorithms and the proposed FAS-SGC on the SEM image; the segmentation results are shown in Fig. 16.
In this experiment, we again first ran the FAS-SGC to obtain the parameter c. Fig. 16 shows that all these algorithms except the FNCut can detect holes. Here, the results provided by HMRF-FCM, FLICM, KWFLICM, FRFCM, and DSFCM_N include too many small areas because these algorithms are sensitive to noise. Liu's algorithm, SFFCM, AFCF, and FAS-SGC generate better detection results due to the use of superpixel algorithms.
To further illustrate these experimental results, Table 5 shows the performance comparison of different algorithms. We can see that the FAS-SGC obtains the best performance indexes. Although both AFCF and FAS-SGC are automatic image segmentation algorithms, they still show better performance than the other comparative algorithms.
In practical applications, researchers can obtain the data of hole distribution from the detection results. These data are important for the analysis of material properties. Table 6 shows the average area of holes in Fig. 16. Note that Table 6 does not contain the data obtained by FNCut since it fails to detect holes on the SEM image. Furthermore, Fig. 17 compares the error of average area among different algorithms. We define the error of average area as the difference between the average area of holes in the segmentation result and the average area of holes in the ground truth. The FAS-SGC obtains the minimum error, which demonstrates that it can obtain more accurate data of hole distribution on the SEM image than the other comparative algorithms.
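The error of average area can be sketched as follows. This is a minimal illustration under stated assumptions: holes are taken to be 4-connected foreground regions of a binary mask, area is counted in pixels, and the error is reported as an absolute difference. The connectivity, the use of the absolute value, and all function names are assumptions and are not specified in the text.

```python
from collections import deque


def hole_areas(mask):
    """Pixel areas of 4-connected foreground regions in a 2D grid of 0/1 values."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited hole pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def average_area(mask):
    """Mean hole area in pixels; 0.0 when the mask contains no hole."""
    areas = hole_areas(mask)
    return sum(areas) / len(areas) if areas else 0.0


def average_area_error(seg_mask, gt_mask):
    """Absolute difference between the mean hole areas of two masks."""
    return abs(average_area(seg_mask) - average_area(gt_mask))


if __name__ == "__main__":
    gt_mask = [[1, 1, 0, 0],
               [1, 1, 0, 1],
               [0, 0, 0, 1]]
    seg_mask = [[1, 1, 0, 0],
                [1, 0, 0, 1],
                [0, 0, 0, 1]]
    print(hole_areas(gt_mask), average_area_error(seg_mask, gt_mask))
```

Because the measure averages over all detected holes, spurious small regions caused by noise pull the average down, which is one way over-segmented results can produce a large error.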

E. EXECUTION TIME
Execution time is often used to assess the practicability of a segmentation algorithm. In Sections IV.B-IV.D, we demonstrated that the proposed FAS-SGC is superior to the comparative algorithms in terms of segmentation results. Here, we demonstrate the second advantage of the FAS-SGC, i.e., its high computational efficiency. Table 7 shows the execution time of different algorithms on the synthetic image shown in Fig. 14(a), the BSDS500, and the SEM image shown in Fig. 16(a).
In Table 7, we can see that the FAS-SGC takes the least time for different kinds of images, which shows that it has higher computational efficiency than the comparative algorithms. Among the comparative algorithms, HMRF-FCM, KWFLICM, Liu's algorithm, and DSFCM_N take a long time to achieve image segmentation since they compute neighboring spatial information in each iteration. The FNCut takes a long time due to the learning of pairwise affinities. Because the spatial distance information of FLICM can be replaced by a convolution operation, the improved code of FLICM is fast for image segmentation. The FRFCM is fast since the spatial neighboring information is computed in advance. Both SFFCM and AFCF are fast since they employ superpixel algorithms to reduce the number of clustering samples. Moreover, the adaptive neighboring information is computed only once throughout the algorithm. The FAS-SGC is the fastest since it employs superpixel algorithms to reduce the size of the affinity matrix and uses graph clustering instead of the DP algorithm to estimate clustering parameters. In addition, the execution time of all algorithms increases with image resolution, and the proposed FAS-SGC shows a clearer advantage for high-resolution image segmentation.
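The effect of superpixels on the affinity matrix can be illustrated with simple arithmetic. The sketch below assumes a dense affinity matrix stored as 8-byte floats and uses a hypothetical superpixel count of 500; neither the storage format nor the superpixel count comes from the experiments, and only the SEM resolution is taken from Section IV.D.

```python
def dense_affinity_cost(num_nodes, bytes_per_entry=8):
    """Entries and bytes of a dense num_nodes x num_nodes affinity matrix."""
    entries = num_nodes * num_nodes
    return entries, entries * bytes_per_entry


# SEM image resolution reported in Section IV.D.
width, height = 1278, 892
num_pixels = width * height

# Hypothetical superpixel count, chosen only for illustration.
num_superpixels = 500

pixel_entries, pixel_bytes = dense_affinity_cost(num_pixels)
sp_entries, sp_bytes = dense_affinity_cost(num_superpixels)

print(f"pixel graph:      {num_pixels} nodes, {pixel_entries:.3e} entries, "
      f"{pixel_bytes / 1e12:.1f} TB")
print(f"superpixel graph: {num_superpixels} nodes, {sp_entries} entries, "
      f"{sp_bytes / 1e6:.1f} MB")
```

Since the number of entries grows with the square of the number of nodes, replacing pixels by superpixels shrinks the matrix by the square of the reduction ratio, which is consistent with the growing advantage observed at higher resolutions.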

V. CONCLUSION
In this article, we studied fast and automatic image segmentation using superpixel-based graph clustering (FAS-SGC). We first analyzed popular clustering-based image segmentation algorithms and found that parameter setting and computational complexity are the two main issues that affect the performance of these algorithms. To address these two issues, we presented the AMR-RMR-WT, which provides better boundary accuracy than the AMR-WT, and the EGC algorithm, which estimates the clustering parameter with high computational efficiency. Finally, we described the implementation of the FAS-SGC in detail and discussed its advantages.
We conducted three experiments to demonstrate the superiority of the proposed FAS-SGC. These experiments show that the FAS-SGC has two clear advantages. First, it obtains the best segmentation results owing to the use of the AMR-RMR-WT and the EGC. Second, it has the lowest computational cost owing to the use of superpixels and graph clustering. The experiments also show that the FAS-SGC is effective for different types of image segmentation tasks.
Though the FAS-SGC is effective and efficient for image segmentation, and it is fully automatic without human-computer interaction, the FAS-SGC does not provide as good segmentation results as those obtained by supervised image segmentation algorithms. In the future, we will explore the combination of supervised learning and unsupervised learning algorithms to achieve weakly supervised image segmentation [50].