Robust Self-Sparse Fuzzy Clustering for Image Segmentation

Traditional fuzzy clustering algorithms suffer from two problems in image segmentation. One is that these algorithms are sensitive to outliers due to the non-sparsity of fuzzy memberships. The other is that these algorithms often cause image over-segmentation due to the loss of image local spatial information. To address these issues, we propose a robust self-sparse fuzzy clustering algorithm (RSSFCA) for image segmentation. The proposed RSSFCA makes two contributions. The first is a regularization under Gaussian metric that is integrated into the objective function of fuzzy clustering algorithms to obtain sparse fuzzy memberships, which reduces the proportion of noisy features and improves clustering results. The second is a connected-component filtering based on area density balance strategy (CCF-ADB) that addresses the problem of image over-segmentation. Compared to the integration of local spatial information into the objective functions, the presented CCF-ADB is simpler and faster for the removal of small areas. Experimental results show that the proposed RSSFCA addresses two problems in current fuzzy clustering algorithms, i.e., outlier sensitivity and over-segmentation, and that it provides better image segmentation results than state-of-the-art algorithms.


I. INTRODUCTION
Clustering is an unsupervised classification technique that aims to partition data into several disjoint subsets according to data features [1]. As one of the important techniques of data analysis and data mining, clustering has been widely used for text classification, biometric feature recognition, image segmentation, etc. Among these applications, image segmentation based on clustering is very popular, and a large number of algorithms have been proposed and used in the fields of medicine [2], remote sensing [3], intelligent transportation [4], etc. Popular clustering algorithms mainly include hierarchical clustering [5], spectral clustering [6], fuzzy clustering [7], affinity propagation clustering [8], density peak clustering [9], etc. In this paper, we mainly focus on fuzzy clustering based on objective function optimization for image segmentation.
The associate editor coordinating the review of this manuscript and approving it for publication was Byung Cheol Song.
Fuzzy c-means clustering (FCM) often suffers from two problems in image segmentation: first, it is sensitive to noise [10]; second, it ignores the local spatial information of images, which leads to over-segmentation [11]. To address these drawbacks, researchers have proposed many improved fuzzy clustering algorithms. They mainly adopt two strategies: one is to utilize regularization approaches to achieve self-optimization of FCM, and the other is to incorporate local spatial information into the objective functions of FCM to improve image segmentation.
To improve the robustness of FCM against noise, a lot of improved FCM algorithms based on self-optimization have been presented. Ahmed et al. [12] proposed bias correction FCM (BCFCM) that integrates a bias field into its objective function. Although the bias field is able to correct some pixels corrupted by noise, it shows low robustness for different kinds of noisy images since the bias field is often not sparse. Based on this work, Zhang et al. [13] proposed a deviation-sparse fuzzy c-means (DSFCM) that utilizes regularization with sparsity constraint to mitigate significantly the defect of BCFCM. Though DSFCM is able to provide accurate clustering centers, it is sensitive to regularization parameters and thus shows low robustness. Motivated by the Network lasso [14], Guo et al. [15] further explored the affinity-based regularization and proposed the membership affinity lasso regularized fuzzy c-means clustering (MalFCM). The MalFCM generates better classification results than DSFCM via building an affinity matrix, but MalFCM requires high computational complexity since the alternating direction method of multipliers (ADMM) [16] is used to optimize the affinity matrix. However, both DSFCM and MalFCM depend on a condition that the sum of membership values of each pixel is equal to 1. To loosen this condition, Pal et al. [17] proposed possibilistic FCM (PFCM) by considering memberships and possibilities, which hybridizes FCM and possibilistic c-means [18] with looser constraint on membership. Although PFCM can mine more informative descriptions of data, it lacks robustness against data with non-spherical distribution. To settle the shortcoming, Bai et al. [19] proposed similarity measure-based PFCM. Continuing in direction of membership, Zhou et al. [20] reported a membership scaling FCM (MSFCM) based on triangle inequality. The MSFCM not only effectively improves the convergence speed of the model but also maintains the accuracy of data clustering.
In most of the improved FCM algorithms mentioned above, the fuzzifier exponent is treated as a constant, and its value is usually set to 2. However, in practical applications, changing the value of the fuzzifier exponent easily leads to different segmentation results [21], [22]. To address this problem, Miyamoto and Mukaidono [23] proposed maximum entropy FCM (MEFCM), which avoids the selection of the fuzzifier exponent and simplifies its calculation. Inspired by MEFCM, both [24] and [25] introduced relative entropy and kernel distance to improve their objective functions. However, these FCM algorithms based on entropy theory do not consider the sparsity of memberships, which leads to the misclassification of outliers. Recently, Xenaki et al. [26] and Koutroumbas et al. [27] showed that the sparsity of fuzzy memberships can deal well with closely located clusters. Similarly, Xu et al. [28] demonstrated that sparsity can avoid performance degradation of clustering algorithms. These previous reports show that sparse fuzzy memberships can improve clustering results. However, it is still a challenge to obtain a reasonable sparsity of fuzzy memberships. Moreover, it is difficult to apply the algorithms mentioned above to image segmentation due to the lack of local spatial information.
To reduce over-segmentation caused by fuzzy clustering algorithms, the local spatial information of images is usually incorporated into their objective functions [29]-[33].
The utilization of local spatial information can be grouped into two categories: obtaining the neighboring information in a fixed-size window or in an adaptive neighboring window. Early improved FCM algorithms usually employ a fixed-size window of size w × w to obtain local spatial information such as FCM_S [12], FCM_S1 [29], FCM_S2 [29], FLICM [30], KWFLICM [33], DSFCM_N [13] etc. Similar to the above algorithms, recently, Mishro et al. [34] proposed a novel type-2 adaptive weighted spatial FCM (AWSFCM) clustering algorithm that employs a fuzzy linguistic fuzzifier and spatial information of membership to reduce misclassification of pixels. However, in practical applications, a large window usually leads to rich spatial information but high computational cost; on the contrary, a small window leads to low computational cost but limited spatial information. Therefore, a middle-size window like 3 × 3 or 5 × 5 is popular for these algorithms since the window achieves a balance between spatial information and computational cost. To overcome the limitation of fixed-size windows, some researchers employ superpixel techniques to obtain adaptive neighboring information such as Liu's algorithm [35], FDCM_SSR [36], SFFCM [37], AFCA [38], etc. Although the second strategy improves segmentation accuracy due to the utilization of adaptive neighboring information, these algorithms seriously depend on the selection of superpixel algorithms [39]- [41]. A good superpixel algorithm not only improves the final segmentation result, but also reduces computational cost efficiently. However, the segmentation result will be worse than one obtained by FCM if the superpixel algorithm is unsuitable for images to be segmented. In particular, superpixel algorithms are unavailable for images with low-contrast or blurred edges.
For the reduction of over-segmentation, image filtering is also a popular strategy. In [42], Szilagyi et al. proposed enhanced FCM (EnFCM) by introducing histogram to its objective function and applying a local linear-weight filtering to each pixel. As the number of gray-levels is generally much smaller than the number of pixels in a grayscale image, the EnFCM achieves high computational efficiency in image segmentation. On the basis of this work, Cai et al. [43] proposed the fast generalized FCM (FGFCM) by integrating a bilateral filter to its objective function. Furthermore, Zhao et al. [44] presented a FCM algorithm with self-tuning non-local spatial information [45]. However, the computation of non-local spatial information is time-consuming. Inspired by FGFCM, Guo et al. [46] proposed a noise detecting FCM (NDFCM) with auto-tuning parameters by measuring local variance of gray levels. More recently, Lei et al. [47] proposed a fast and robust FCM (FRFCM) based on morphological reconstruction and membership filtering. FRFCM achieves good segmentation results and requires short execution time for different kinds of grayscale images.
According to the studies mentioned above, currently popular fuzzy clustering algorithms still face two challenges in image segmentation tasks. The first is that these algorithms lack immunity to outliers due to the non-sparsity of fuzzy memberships. The second is that it is difficult for them to overcome over-segmentation effectively. To address these challenges, we employ a novel regularization to obtain sparse fuzzy memberships, use a connected-component filtering algorithm based on area density balance strategy (CCF-ADB) to achieve adaptive region merging, and thus propose a robust self-sparse fuzzy clustering algorithm (RSSFCA) for image segmentation. The proposed RSSFCA (see footnote 1) has two advantages:
• The RSSFCA utilizes regularization under Gaussian metric to obtain properly sparse memberships that can effectively reduce non-homogeneous interference and achieve better classification than popular self-optimized FCM algorithms.
• The RSSFCA employs the CCF-ADB to achieve automatic region merging, which is superior to the strategy of incorporating local spatial information and helps RSSFCA to achieve better image segmentation than improved FCM algorithms with local spatial information.
The organization of this paper is as follows. We first discuss our motivations in Section II. Then, we introduce the proposed algorithm and analyze its advantages in Section III. Next, we conduct experiments and discuss experimental results in Section IV. Finally, we provide the conclusion in Section V.

II. MOTIVATION
To improve fuzzy clustering algorithms for image segmentation, researchers usually adopt two strategies. One is introducing regularization terms into objective functions to improve clustering results; the other is integrating local spatial information into objective functions to overcome over-segmentation. As the classification results of pixels are decided by fuzzy memberships, a good objective function should satisfy the constraint of sparsity of fuzzy memberships. However, it is difficult to obtain sparse fuzzy memberships with popular self-optimized FCM algorithms such as DSFCM and MalFCM. Motivated by this, we introduce a new regularization into the objective function of FCM. Most improved FCM algorithms employ a fixed-size window or adaptive neighboring windows to overcome over-segmentation, but the utilization of local spatial information often leads to smoothed boundaries or high computational cost. To address this problem, we present a connected-component filtering based on area density balance strategy to achieve adaptive region merging, which maintains boundary accuracy and requires low computational cost.
A. SELF-OPTIMIZED FCM
Let $X = \{x_1, x_2, \ldots, x_n\} \in \mathbb{R}^{D \times n}$ be an unlabeled data set. We aim to split $X$ into $c$ disjoint clusters with the corresponding clustering centers $V = \{v_1, v_2, \ldots, v_c\}$. The FCM algorithm [7] uses the Lagrange multiplier technique to find optimal solutions with respect to the sum of squared Euclidean distance errors. The objective function of FCM is defined as:
$$J = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \|x_j - v_i\|^2, \tag{1}$$
where $c$ denotes the number of clusters, $n$ denotes the number of samples, $u_{ij}$ is the degree of membership of $x_j$ with respect to the clustering center $v_i$, $0 \le u_{ij} \le 1$ and $\sum_{i=1}^{c} u_{ij} = 1$, $m$ is the fuzzification exponent for the partition matrix, and $\|\cdot\|$ represents the Euclidean norm.
Footnote 1: Source code is available at https://github.com/SUST-reynole/RSSFCA
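The alternating optimization behind FCM can be sketched in a few lines. The following is a minimal illustration rather than the authors' implementation: it assumes scalar data, uses the standard closed-form membership and center updates obtained from the Lagrange conditions of the FCM objective, and the function name `fcm_1d`, the small constant `eps`, and the toy data are hypothetical.

```python
def fcm_1d(data, centers, m=2.0, iters=50, eps=1e-12):
    """Alternate the standard FCM membership and center updates on scalar data.

    U[j][i] is the membership of sample j in cluster i (rows sum to 1).
    """
    n, c = len(data), len(centers)
    U = []
    for _ in range(iters):
        # Membership update: u_ij is proportional to (d_ij^2)^(-1/(m-1)).
        U = []
        for x in data:
            w = [((x - v) ** 2 + eps) ** (-1.0 / (m - 1.0)) for v in centers]
            total = sum(w)
            U.append([wi / total for wi in w])
        # Center update: weighted mean of the data with weights u_ij^m.
        centers = [
            sum(U[j][i] ** m * data[j] for j in range(n))
            / sum(U[j][i] ** m for j in range(n))
            for i in range(c)
        ]
    return U, centers


data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
U, centers = fcm_1d(data, [1.0, 4.0])
print(centers)
```

Every membership produced this way is strictly positive, which is the non-sparsity that the regularized variants discussed in this section try to remove.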
It is well known that FCM is sensitive to noise and outliers in data, and that it lacks robustness for non-spherical data clustering due to the use of the square loss in (1). Regularization-based FCM algorithms address the problem by adding a constraint, i.e., a regularization term, to their objective functions. The regularization term helps FCM to improve clustering accuracy for data corrupted by noise or for non-spherical data. Zhang et al. [13] introduced deviation sparsity into FCM (DSFCM) to construct a novel objective function, given in (2), where $e_j$ is the deviation between $x_j$ and its theoretical value, and $\lambda$ is the regularization scalar. It is easy to obtain a sparse matrix $E = \{e_1, e_2, \ldots, e_n\} \in \mathbb{R}^{D \times n}$ by using soft-thresholding [48]. The DSFCM can improve classification accuracy for noisy data. On the one hand, it uses large deviation values of $e_j$ to revise $x_j$ corrupted by noise; on the other hand, it uses small deviation values to maintain the original $x_j$ uncorrupted by noise. Therefore, the DSFCM improves the robustness of FCM for noisy data clustering through the regularization term in its objective function. However, it is difficult to set the value of $\lambda$ for different kinds of data. Inspired by the Network lasso [14], Guo et al. [15] developed a new regularization based on the membership affinity lasso (MalFCM) with the objective function given in (3), where $w_{jk}$ is the affinity of the $j$th point $x_j$ to the $k$th point $x_k$. The MalFCM employs the alternating direction method of multipliers (ADMM) [16] to optimize the membership affinity lasso. Consequently, it achieves accurate classification for complex data due to the consideration of membership similarity. However, the MalFCM requires high computational cost because ADMM is time-consuming. Note that both DSFCM and MalFCM require setting the value of the parameter $m$.
To reduce the influence of parameters, Miyamoto and Mukaidono [23] integrated a maximum-entropy penalty term into the objective function, which gives the MEFCM objective in (4). Here, the entropy term works as the degree of fuzzifier. As a result, the MEFCM not only avoids the parameter setting of the fuzzification exponent $m$, but also minimizes the intra-class dispersion and maximizes the inter-class negative weight entropy. However, it is still difficult to set the value of $\lambda$ for MEFCM.
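The role of soft-thresholding in producing a sparse deviation matrix can be illustrated with the generic scalar operator. This is a sketch of the textbook operator only; how DSFCM derives the threshold from $\lambda$ and the memberships is not reproduced here, and the function name and sample values are hypothetical.

```python
def soft_threshold(value, threshold):
    """Generic soft-thresholding: shrink toward zero and clip small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


# Small residuals (likely clean samples) become exactly zero,
# large residuals (likely noise) are kept but shrunk.
residuals = [0.05, -0.2, 2.0, -3.5]
deviations = [soft_threshold(r, 0.5) for r in residuals]
print(deviations)
```

Entries that shrink to zero leave the corresponding sample untouched, while the remaining nonzero entries act as corrections for samples that deviate strongly.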
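To see why entropy-regularized memberships are not sparse, consider a commonly used form of the maximum-entropy objective, $\sum_i \sum_j u_{ij} d_{ij}^2 + \lambda \sum_i \sum_j u_{ij} \ln u_{ij}$ with $\sum_i u_{ij} = 1$. This form is an assumption made for illustration and may differ in detail from the cited formulation. Under it, the Lagrange conditions give a softmax over the negative scaled distances, sketched below with hypothetical names and values.

```python
import math


def max_entropy_memberships(sq_dists, lam):
    """Memberships of one sample under the assumed entropy-regularized objective.

    u_i is proportional to exp(-d_i^2 / lam); the minimum is subtracted for stability.
    """
    base = min(sq_dists)
    weights = [math.exp(-(d - base) / lam) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


u_entropy = max_entropy_memberships([0.1, 0.6, 2.0], lam=0.5)
print(u_entropy)
```

Because the exponential is always positive, every cluster receives a nonzero membership, no matter how far away its center is.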
Regularization-based FCM algorithms can improve clustering results for complex data, but they face two difficulties for complex non-spherical data. The first is that the Euclidean distance shows poor robustness for non-spherical data, and the second is that fuzzy membership is non-sparse, easily leading to the ambiguity for identified data. To address the difficulties, we use the Gaussian distribution instead of the Euclidean distance, and integrate the appropriate regularization term to our objective function to make sure that fuzzy memberships are sparse. We will present detailed description of the proposed strategy in Section III. A.

B. FCM WITH LOCAL SPATIAL INFORMATION
Image segmentation algorithms based on FCM easily cause over-segmentation since FCM ignores the local spatial information of images. To overcome this drawback, a large number of improved FCM algorithms have been proposed by incorporating local spatial information into their objective functions [29]-[33]. These algorithms can be categorized into two groups: the first group requires long execution time due to the computation of local spatial information in each iteration, while the second group shows high computational efficiency since the local spatial information is computed only once before the iterations. Generally, the objective function of the first group can be abstracted as in (5), where $G_{ij}$ denotes the fuzzy factor that is used to balance noise suppression and edge detail preservation. The value of $G_{ij}$ depends on the pixels within neighboring windows, and it can change in each iteration. If $x_j$ is a pixel corrupted by noise, then it will be replaced with $G_{ij}$ to reduce the influence of outliers. Therefore, the introduction of constraint terms into objective functions improves noise tolerance and outlier resistance for image segmentation, and different forms of $G_{ij}$ lead to different improved FCM algorithms, such as FCM_S, FCM_S1/S2, FLICM, etc.
Although the algorithms mentioned above can improve image segmentation results, they are impractical due to high computational complexity. To reduce the computational cost, researchers designed fast FCM algorithms that avoid the redundant computation of local spatial information. The objective function of fast FCM algorithms is given in (6), where $n'$ denotes the number of gray levels of test images, and $n' \ll n$. The $\hat{x}_j$ denotes the $j$th pixel in the filtered image. We can define $\hat{x}_j = F(x_{r \in j})$, where $F$ represents a filter and $x_r$ stands for a pixel falling into the neighboring window of $x_j$. Based on different filters, many FCM algorithms for fast image segmentation have been proposed, such as EnFCM, FGFCM, NDFCM, FRFCM, etc.
The introduction of local spatial information is very popular for improving image segmentation results, but it suffers from high computational cost. Although many fast FCM algorithms have been proposed by incorporating gray-level histograms or superpixels into their objective functions [37], it is still a challenge to obtain excellent segmentation results. For this problem, Comaniciu and Meer [39] presented an auxiliary strategy in the mean-shift algorithm, namely eliminating spatial regions containing fewer than $M$ pixels, where $M$ is a threshold. Inspired by this idea, we propose the CCF-ADB to help FCM to overcome over-segmentation. The CCF-ADB is simple and effective; it is superior to the strategy of integrating local spatial information and achieves automatic region merging. We will comprehensively describe the CCF-ADB in Section III-B.

III. METHODOLOGY
In this study, we propose a novel self-sparse fuzzy c-means clustering algorithm (SSFCA) based on regularization approaches to obtain suitable sparse fuzzy memberships. Meanwhile, we utilize the CCF-ADB to merge useless small regions. The proposed RSSFCA can effectively overcome outlier sensitivity and over-segmentation, and thus improve segmentation results.

A. SELF-SPARSE FUZZY C-MEANS
Based on the analysis in Section II-A, DSFCM, MalFCM, and MEFCM cannot obtain sparse fuzzy memberships. To address this issue, we introduce a novel regularization approach by considering $u_{ij}^2$ as a penalty term. We define the objective function of SSFCA as:
$$J = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}\, \Phi(x_j | v_i, \Sigma_i) + \gamma \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^2, \tag{7}$$
where $\Phi(x_j | v_i, \Sigma_i)$ represents the distance function between $x_j$ and $v_i$, and $\gamma$ is a balance factor used for controlling the sparsity of memberships. By changing the value of $\gamma$, the objective function shows different degrees of robustness to outliers or noise. In (7), if the fuzzy membership is sparse, the first term of $J$ will be small while the second term will be large. The SSFCA often requires more iterations than k-means but fewer iterations than FCM to converge. The novel objective function $J$ thus achieves a balance between k-means and FCM: the obtained fuzzy membership is sparser than that provided by FCM but, in contrast with k-means, some fuzzy membership values are neither 0 nor 1. Besides, $\Phi(x_j | v_i, \Sigma_i)$ is defined as:
$$\Phi(x_j | v_i, \Sigma_i) = -\ln \rho(x_j | v_i, \Sigma_i), \tag{8}$$
where $\rho(x_j | v_i, \Sigma_i)$ is the Gaussian density function, defined as:
$$\rho(x_j | v_i, \Sigma_i) = \frac{1}{(2\pi)^{D/2} |\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(x_j - v_i)^T \Sigma_i^{-1} (x_j - v_i)\right), \tag{9}$$
where $D$ denotes the dimension of the input data and $\Sigma_i$ denotes the covariance matrix that describes the intra-class dispersion of the $i$th class. Substituting (9) into (8), we obtain:
$$\Phi(x_j | v_i, \Sigma_i) = \frac{1}{2}(x_j - v_i)^T \Sigma_i^{-1} (x_j - v_i) + \frac{1}{2}\ln|\Sigma_i| + \frac{D}{2}\ln(2\pi). \tag{10}$$
Note that the distance metric in (10) is different from the Mahalanobis distance, since the former includes the term $\ln|\Sigma_i|$, which may be a large negative value for densely distributed data. As a result, $\Phi(x_j | v_i, \Sigma_i)$ may violate the non-negativity constraint on distances, as shown in Fig. 1. In Fig. 1(a), there are three groups of data provided by (9) with parameters $v_1 = 0$ and $\Sigma_1 = 0.36$; $v_2 = 0$ and $\Sigma_2 = 0.16$; and $v_3 = 0$ and $\Sigma_3 = 0.04$. A smaller covariance value corresponds to a more compact curve, i.e., a more compact data distribution. Fig. 1(b) shows the result provided by (10).
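The sign problem can be checked numerically. The sketch below assumes that (10) is the negative logarithm of the Gaussian density in (9), restricted to one dimension, and evaluates it at the cluster center for the three variances quoted for Fig. 1; the function name is hypothetical.

```python
import math


def gaussian_distance_1d(x, v, var):
    """Negative log of a 1-D Gaussian density with mean v and variance var."""
    return 0.5 * (x - v) ** 2 / var + 0.5 * math.log(2.0 * math.pi * var)


# Evaluate at the center x = v = 0, where the quadratic term vanishes.
for var in (0.36, 0.16, 0.04):
    print(var, gaussian_distance_1d(0.0, 0.0, var))
```

At the center only the logarithmic term remains, so under this assumption the value becomes negative as soon as $2\pi\Sigma_i < 1$, which is the case for the most compact of the three distributions.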
Note that the blue curve includes negative values, which violates the requirement that a distance measure be non-negative. As the covariance value decreases, more negative values appear, leading to serious errors in the distance measure and to misclassification. To solve this problem, we use the modified distance given in (11). Fig. 1(c) shows that the modified $\Phi(x_j | v_i, \Sigma_i)$ satisfies the non-negativity constraint of a distance. Substituting the modified $\Phi(x_j | v_i, \Sigma_i)$ into (7) gives the final objective function (12). The objective $J$ can be separated into independent sub-problems, one for each sample $x_j$, with the constraints $0 \le u_{ij} \le 1$ and $\sum_{i=1}^{c} u_{ij} = 1$, which gives (13). Through simplification, $J_j$ can be rewritten as (14), where $h_{ij} = -\Phi(x_j | v_i, \Sigma_i)/2\gamma$. Utilizing the optimization strategy proposed in [49] to solve (14), we obtain fuzzy memberships with different degrees of sparsity by tuning the value of $\gamma$. Similarly, $J$ can be separated into independent sub-problems, one for each cluster, to obtain the clustering center $v_i$ by solving (15). Furthermore, the updated covariance matrix $\Sigma_i$ is obtained by solving (16). To reduce the number of iterations, the membership matrix, the clustering centers, and the covariance matrices are initialized by the FCM algorithm. We summarize the procedure of the proposed SSFCA as follows:
(1) Set the number of clusters $c$, the regularization parameter $\gamma$, the convergence threshold $\eta$, and the maximum iteration number $T$.
(2) Initialize the membership $U^{(0)}$, the clustering centers $V^{(0)}$, and the covariance matrix $\Sigma^{(0)}$ using the FCM algorithm.
(3) Initialize the iteration counter $t$.
(4) Update the membership matrix, the clustering centers, and the covariance matrix using (14), (15), and (16), respectively.
(5) Update the objective function $J^{(t)}$ using (12).
(6) If the convergence criterion defined by $\eta$ is met or the maximum iteration number $T$ is reached, stop; otherwise, update $t = t + 1$, and go to step 4.
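The membership step can be illustrated as follows. Assuming the per-sample problem has the form $\sum_i u_{ij}\Phi_{ij} + \gamma \sum_i u_{ij}^2$ over the probability simplex, completing the square gives $\gamma \sum_i (u_{ij} - h_{ij})^2$ plus a constant, with $h_{ij} = -\Phi_{ij}/2\gamma$, so the minimizer is the Euclidean projection of $h_j$ onto the simplex. The sketch below uses the well-known sort-based projection; whether this matches the exact procedure of [49] is an assumption, and the function names and distance values are hypothetical.

```python
def project_to_simplex(h):
    """Euclidean projection of h onto {u : u_i >= 0, sum(u) = 1} (sort-based)."""
    ordered = sorted(h, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / k
        # The condition holds for a prefix of the sorted values; keep the last hit.
        if value - candidate > 0.0:
            theta = candidate
    return [max(x - theta, 0.0) for x in h]


def sparse_memberships(distances, gamma):
    """Memberships of one sample from its distances to the c cluster centers."""
    h = [-d / (2.0 * gamma) for d in distances]
    return project_to_simplex(h)


distances = [0.1, 0.6, 2.0]
u_small_gamma = sparse_memberships(distances, gamma=0.2)
u_large_gamma = sparse_memberships(distances, gamma=0.5)
print(u_small_gamma, u_large_gamma)
```

With the small $\gamma$ the result is a hard assignment, while the larger $\gamma$ spreads the membership over the two nearest clusters and still sets the farthest one to exactly zero, which illustrates how $\gamma$ moves the solution between k-means-like and FCM-like behavior.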
To demonstrate the effectiveness of SSFCA, we apply the SSFCA to data clustering with outliers. The test data is generated by sampling pixels at regular grid in an image ''113044'' from Berkeley Segmentation Dataset (BSDS500) [50]. Fig. 2 shows the comparison of clustering results provided by different algorithms. It can be easily seen that the SSFCA with γ = 0.2 achieves the best clustering result, which demonstrates that SSFCA is more robust for outliers than comparative algorithms based on regularization. Fig. 3 further illustrates the effectiveness of the SSFCA on image segmentation using heatmap visualization. It is clear that MEFCM, DSFCM and SSFCA attract more attention on ''horses'' than FCM. Simultaneously, the SSFCA achieves the best result due to the Gaussian metric and self-sparse optimization. Note that we do not provide the result generated by MalFCM since the algorithm requires to construct an affinity membership matrix that is very large for the image ''113044''.

B. OVER-SEGMENTATION REDUCTION
FCM uses pixel classification to achieve image segmentation, where each pixel of images is viewed as an independent sample. Therefore, FCM often causes the problem of over-segmentation, i.e., segmentation results include a large number of isolated small areas as shown in Fig. 4. It can be seen that DSFCM misclassifies more pixels than FCM and MEFCM because of inappropriate sparse deviation. Although the SSFCA partly suppresses the interference of non-homogenous pixels and obtains better visual effect, it still suffers from over-segmentation as shown in Fig. 4(e). Improved FCM algorithms based on the incorporation of local spatial information can alleviate over-segmentation by removing small areas, which is insufficient as shown in Fig. 5.
Although the mean-shift can effectively alleviate over-segmentation by eliminating small regions containing fewer than $M$ pixels, the value of $M$ is often adjusted manually for different images. In this work, we propose a novel CCF-ADB to improve the SSFCA. Fig. 6 shows the framework of the CCF-ADB. Fig. 6(b) includes many useless small regions that reduce the final segmentation accuracy. According to the proposed CCF-ADB, we first compute the areas of all connected components in Fig. 6(b), and then sort these connected components in descending order as shown in Fig. 6(c). Because it is difficult to obtain the value of the threshold $M$ directly from the sorting result, the ADB strategy is used to improve the sorting result as shown in Fig. 6(d). Based on this improvement, it is easy to obtain the maximal interval in Fig. 6(d). The maximal interval corresponds to a region whose area is taken as the value of $M$. Then we can eliminate small connected components as shown in Fig. 6(e) and (f) using the obtained $M$. Finally, Fig. 6(g) shows the segmentation result using the CCF-ADB.
To illustrate the ADB strategy in detail, let $\chi_p$ denote the $p$th point, where $0 \le \chi_p \le 1$ and $1 \le p \le (K+1)$, $p \in \mathbb{N}^+$. The $\alpha_q$ denotes the normalized area of the $q$th region, and $\xi_p$ denotes the number of $\alpha_q$ around $\chi_p$ within radius $\varepsilon$, where $Q$ is the number of connected components in an image generated by the SSFCA. The $\kappa_q$ is the mapping result of $\alpha_q$. We normalize $\kappa_q$ before computing the cutoff area. More details about the density balance algorithm can be found in [51]. We present the detailed description of the CCF-ADB as follows:
Input: A connected-component image generated by the SSFCA.
Step 4: Normalize $\kappa_q$.
Step 5: Compute the value of the cutoff area $M$ using the maximal interval in the normalized $\kappa_q$.
Step 6: Merge regions whose areas are smaller than the value of $M$ using the minimum distance in connected-component images.
Output: A labeled image containing fewer regions than the input image.
According to the CCF-ADB, we compute decision-graph and improved decision-graph as shown in Fig. 7. By comparing Fig. 7(b) and (c), the ADB provides accurate maximum intervals and thus helps the CCF to improve segmentation results from the SSFCA.
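The connected-component part of this procedure can be sketched as follows. This is a simplified illustration, not the CCF-ADB itself: it finds the largest gap directly in the sorted raw areas and omits the ADB mapping, which the text introduces precisely because the raw sorting result is often hard to threshold. The 4-connectivity, the function names, and the toy label image are assumptions, and the merging step is not shown.

```python
from collections import deque


def connected_components(labels):
    """4-connected components of equal-label pixels; returns a list of pixel lists."""
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and labels[ny][nx] == labels[r][c]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def cutoff_area(areas):
    """Area on the larger side of the biggest gap in the descending-sorted areas."""
    ordered = sorted(areas, reverse=True)
    if len(ordered) < 2:
        return 0  # a single region: nothing is small enough to remove
    gaps = [ordered[k] - ordered[k + 1] for k in range(len(ordered) - 1)]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    return ordered[k]


label_image = [
    [0, 0, 0, 1, 1, 1],
    [0, 2, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 1],
]
components = connected_components(label_image)
areas = [len(pixels) for pixels in components]
M = cutoff_area(areas)
small_regions = [pixels for pixels in components if len(pixels) < M]
print(areas, M, small_regions)
```

In this toy image the two isolated pixels are flagged as regions to be merged, while the two large regions are kept.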

IV. EXPERIMENTS
To demonstrate the advantages of the proposed RSSFCA for image segmentation, we conducted experiments on a synthetic image and two benchmark datasets. The first experiment demonstrates the superiority of RSSFCA on an image with significant noise. The latter two experiments demonstrate the practicality and robustness of RSSFCA on different images from the Berkeley Segmentation Dataset (BSDS500) [50] and the Microsoft Research in Cambridge (MSRC) dataset [52].

A. PARAMETERS SETTING
In our experiments, PFCM, KWFLICM, NDFCM, FRFCM, Liu's algorithm, DSFCM_N, MSFCM and AWSFCM are considered as comparative algorithms. The fuzzy exponent is $m = 2$, the convergence threshold is $\eta = 10^{-5}$, and the maximum iteration number is $T = 50$. All algorithms except PFCM, MSFCM and RSSFCA employ local spatial information, and the size of the neighboring window is 3 × 3. For PFCM, the relative importance of membership and of typicality are both 1, and the exponent of typicality is set to 2. In addition, both KWFLICM and MSFCM are free of other parameters. For the NDFCM, the values of the spatial weighting factor, the gray-level weight factor and the controlled scale factor are set to 3, 5, and 3, respectively. Liu's algorithm uses the mean-shift to generate pre-segmentation results, for which three parameters are necessary, i.e., the spatial bandwidth $h_s = 10$, the range bandwidth $h_r = 10$, and the minimum output region $M = 100$, following the original paper. We use the default parameter values for the FRFCM and DSFCM_N. For AWSFCM, we also follow the original paper and set the number of α-planes to 3, which further reduces execution time. The proposed RSSFCA requires a regularization parameter; we set $\gamma = 0.2$. All test algorithms are executed on a DELL desktop with an Intel(R) Core(TM) i7-6700 CPU, 3.4 GHz, and 16 GB RAM.

B. SYNTHETIC IMAGE
To demonstrate the effect of the proposed RSSFCA on image segmentation, we first apply the testing algorithms to a synthetic image corrupted by significant outliers. Fig. 8 shows the comparative segmentation results. The synthetic image includes four clear objects and many useless small objects as shown in Fig. 8(a). Our purpose is to segment the four clear objects while suppressing the useless small objects. Because both PFCM and MSFCM ignore the spatial information of images, their segmentation results contain more interference points as shown in Fig. 8(b) and (h). Focusing on Fig. 8(d), (f), and (i) generated by the NDFCM, FRFCM and AWSFCM, respectively, we find that these segmentation results still include many small objects that reduce the final segmentation accuracy. Both KWFLICM and DSFCM_N provide misclassified results as shown in Fig. 8(c) and (g). Compared to the other algorithms, Liu's algorithm achieves better noise suppression and removal of useless small regions as shown in Fig. 8(e), since it employs the mean-shift algorithm to obtain better adaptive neighboring spatial information.
Although Fig. 8 shows that better spatial information corresponds to better segmentation results, Liu's algorithm shows limited capability for the synthetic image with complex background. The proposed RSSFCA shows the best segmentation result since four objects are segmented accurately as shown in Fig. 8(j). In general, the RSSFCA not only suppresses noise effectively, but also merges useless small regions to obtain excellent segmentation result.
To estimate the performance of all testing algorithms on Fig. 8(a), we use two indices, i.e., the segmentation accuracy (SA) and the quantitative index score (S). The SA and S are computed as follows: where the A i represents a segmentation result, the G i denotes the corresponding Ground Truth, the c is the number of clusters and n is the total number of pixels of images. Ideal segmentation results correspond to high values of SA ( = 1) and S ( = 1). Table 1 shows values of SA and S for Fig. 8.
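As an illustration of how two such indices can be computed, the sketch below assumes one common formulation that is consistent with the symbols above: SA as the fraction of pixels whose predicted label equals the ground-truth label, i.e., $\sum_i |A_i \cap G_i| / n$, and S as the mean over clusters of $|A_i \cap G_i| / |A_i \cup G_i|$. These definitions, the assumption that predicted labels are already matched to ground-truth labels, and the function name are not taken from the text.

```python
def sa_and_s(pred, truth, c):
    """Assumed SA (pixel accuracy) and S (mean per-cluster overlap ratio).

    pred and truth are flat label sequences with labels in 0..c-1.
    """
    n = len(pred)
    sa = sum(1 for p, g in zip(pred, truth) if p == g) / n
    score = 0.0
    for i in range(c):
        inter = sum(1 for p, g in zip(pred, truth) if p == i and g == i)
        union = sum(1 for p, g in zip(pred, truth) if p == i or g == i)
        if union:  # a label absent from both maps contributes nothing
            score += inter / union
    return sa, score / c


sa, s = sa_and_s([0, 0, 1, 1], [0, 1, 1, 1], c=2)
print(sa, s)
```

Under these assumed definitions, both indices reach 1 only when the segmentation coincides with the ground truth.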
In Table 1, both PFCM and MSFCM provide lower values of SA and S, which is consistent with the visual effect of Fig. 8(b) and (h). Both KWFLICM and DSFCM_N provide low values of S, which shows that these two algorithms cannot achieve accurate object segmentation for the synthetic image. The NDFCM, FRFCM and AWSFCM provide similar values of SA and S due to the use of the same neighboring windows. Liu's algorithm obtains higher values of SA and S than the previous algorithms but lower values than the RSSFCA. Table 1 further demonstrates that the RSSFCA outperforms the comparative algorithms since it integrates a regularization under Gaussian metric into its objective function and uses the CCF-ADB to optimize the final segmentation result.

C. BENCHMARKS
In this section, we mainly validate the effectiveness of the proposed RSSFCA on BSDS500 [50] and MSRC [52]. The BSDS500 includes 500 natural images with size of 481×321 or 321 × 481, and each image corresponds to 4-9 manually generated ground truths with accurate pixel-wise labels. These ground truths are delineated manually by different human subjects. The MSRC collects 591 images with size of 320 × 213 or 213 × 320 and covers 23 object classes.
In addition, all test images have been transformed from the RGB color space to CIELAB. Figs. 9-10 compare the segmentation results of different algorithms on BSDS500. According to Figs. 9-10, both PFCM and MSFCM generate poor segmentation results that include too many small regions, since they cannot overcome the sensitivity to intensity nonuniformity. Although both KWFLICM and DSFCM_N reduce the number of small regions by employing a neighboring window, they easily cause mis-segmentation, as shown in Figs. 9-10. The AWSFCM shows low robustness, since it is only valid for the first test image, as shown in Fig. 10. More generally, the utilization of local spatial information may cause the loss of image details, or even wrong clustering results for some images. Although both NDFCM and FRFCM obtain better segmentation results than KWFLICM, DSFCM_N and AWSFCM by using improved image filtering approaches, they still suffer from over-segmentation. Liu's algorithm further improves segmentation results by incorporating large region-level information into its objective function, which avoids the limitation of small and fixed neighboring windows. Compared to the previous algorithms, the proposed RSSFCA generates better segmentation results due to its property of self-sparsity and the utilization of CCF-ADB.
Fig. 11 compares the segmentation results on MSRC. The segmentation results in Fig. 11 show a better visual effect than those in Figs. 9-10 because the backgrounds of MSRC images are simpler than those of BSDS500 images. As in Figs. 9-10, the proposed RSSFCA provides the best segmentation result, which further demonstrates that the proposed RSSFCA outperforms the comparative algorithms for image segmentation.
To evaluate the segmentation performance of different algorithms, four performance measures [47], namely, probabilistic rand index (PRI), covering (CV), variation of information (VI), and global consistency error (GCE), are used in our experiments. If a segmentation result is close to its corresponding ground truth, then the values of PRI and CV will be large while the values of VI and GCE will be small.
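Two of these measures can be sketched in a few lines. The sketch below assumes the common formulation in which the PRI of a segmentation against several ground truths is the mean of the pairwise Rand index over those ground truths, and in which VI is the sum of the two conditional entropies H(A|B) + H(B|A), here with the natural logarithm. The function names are hypothetical, images are taken as flat label sequences with at least two pixels, and this is not the evaluation code used in the experiments.

```python
from collections import Counter
from math import comb, log


def rand_index(seg, gt):
    """Fraction of pixel pairs on which two labelings agree (same/different)."""
    n = len(seg)
    total = comb(n, 2)
    same_both = sum(comb(v, 2) for v in Counter(zip(seg, gt)).values())
    same_seg = sum(comb(v, 2) for v in Counter(seg).values())
    same_gt = sum(comb(v, 2) for v in Counter(gt).values())
    # Agreeing pairs = pairs grouped in both + pairs separated in both.
    return (total + 2 * same_both - same_seg - same_gt) / total


def probabilistic_rand_index(seg, gts):
    """Mean Rand index of `seg` against a list of ground-truth labelings."""
    return sum(rand_index(seg, gt) for gt in gts) / len(gts)


def variation_of_information(seg, gt):
    """VI = H(seg | gt) + H(gt | seg), in nats; 0 for identical partitions."""
    n = len(seg)
    a = Counter(seg)
    b = Counter(gt)
    vi = 0.0
    for (i, j), nij in Counter(zip(seg, gt)).items():
        vi -= (nij / n) * (log(nij / a[i]) + log(nij / b[j]))
    return vi
```

Both functions depend only on how pixels are grouped, not on the label values themselves, so a segmentation with permuted labels scores the same as the original.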
In our experiments, the number of clusters c is varied from 2 to 6 for each image in BSDS500 and from 2 to 4 for each image in MSRC. For each image, we choose the group of measures that corresponds to the largest value of PRI as the final performance measures. Tables 2-3 show the average values of PRI, CV, VI and GCE over all images in BSDS500 and MSRC, respectively. Tables 2-3 show that PFCM provides poor performance measures because it is sensitive to parameter tuning. Both Liu's algorithm and AWSFCM obtain better values of PRI, CV, VI, and GCE than the other comparative algorithms. The proposed RSSFCA provides the best values of PRI, CV, VI, and GCE. Taken together, Figs. 9-11 and Tables 2-3 show that the proposed RSSFCA achieves high-quality image segmentation on different benchmark images, which further demonstrates the effectiveness and robustness of RSSFCA.
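The selection protocol described above can be sketched as follows. The data layout (one dictionary per image, mapping each tested c to its four measures) and the function names are assumptions made for illustration only.

```python
MEASURES = ("PRI", "CV", "VI", "GCE")


def best_measures(per_c):
    """Pick the group of measures whose PRI is largest over the tested c."""
    return max(per_c.values(), key=lambda m: m["PRI"])


def average_over_images(per_image):
    """Average the selected measures over all images of a benchmark."""
    chosen = [best_measures(per_c) for per_c in per_image]
    return {k: sum(m[k] for m in chosen) / len(chosen) for k in MEASURES}
```

Note that only PRI drives the selection; CV, VI and GCE are reported for the same c and are not optimized separately.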

D. COMPUTATIONAL COMPLEXITY
Computational complexity plays an important role in practical applications. Table 4 compares the computational complexity of different algorithms, where n is the number of pixels of an image, c is the number of clusters, t is the number of iterations, w is the size of the local window, l is the number of gray levels of the image (the value of l is close to the value of n for color images), K is the number of α-planes for AWSFCM, and O(M(c)) is the computational complexity of Newton's method for each iteration [49]. According to Table 4, PFCM imposes a higher computational burden than MSFCM because PFCM requires parameter initialization using FCM with t iterations. The KWFLICM, DSFCM_N, and AWSFCM have high computational complexity, since these algorithms compute the local spatial information of images in each iteration. Unlike the previous five algorithms, both NDFCM and FRFCM compute the local spatial information of images only once during segmentation, so they have low computational complexity. Because the SSFCA involves two computational steps, its computational complexity is composed of O(nct) and O(n(M(c) + c)t). The RSSFCA includes SSFCA and CCF-ADB. As the time complexity of SSFCA is much larger than that of CCF-ADB, the time complexity of RSSFCA is close to that of SSFCA. Therefore, RSSFCA avoids heavy computational complexity because of its linear iterative scheme.

E. DISCUSSION
In this section, we discuss the influence of the regularization parameter γ on the sparsity of membership matrices. Fig. 12 compares the membership matrices obtained by tuning the value of γ on the image "12003", where the number of clusters is c = 2. We obtain the sparsest membership matrix, which is close to that of k-means, when the value of γ is close to zero. As the value of γ increases, more membership values approach 0.5. All membership values equal 0.5 when the value of γ equals 1.
To discuss the effectiveness of the parameter γ, two validity functions [53], i.e., the partition coefficient (V_pc) and the partition entropy (V_pe), are used for performance evaluation. Both V_pc and V_pe are intuitive indicators used to measure the sparsity strength of membership matrices. If the fuzziness of a membership matrix is weaker, the value of V_pc will be larger while the value of V_pe will be smaller, and vice versa. Fig. 13 shows the change of the validity functions as the value of γ varies, which reflects the performance change of the SSFCA. In Fig. 13, it is observed that both V_pe and V_pc are constant when γ > 0.5, because all membership values are non-sparse and equivalent. The parameter γ can control the degree of sparsity only when γ ≤ 0.5. Therefore, we empirically set the value of γ to 0.2.
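The behavior of the two validity functions can be checked with a minimal sketch. It assumes the standard definitions of the partition coefficient and partition entropy, V_pc = (1/n) Σ_k Σ_i u_ik² and V_pe = -(1/n) Σ_k Σ_i u_ik log u_ik, with the convention 0 log 0 = 0 and the natural logarithm. The membership matrix is represented as one row of c memberships per pixel, and the function names are hypothetical.

```python
from math import log


def partition_coefficient(u):
    """V_pc: mean over pixels of the sum of squared memberships."""
    return sum(x * x for row in u for x in row) / len(u)


def partition_entropy(u):
    """V_pe: mean over pixels of the membership entropy (0 * log 0 := 0)."""
    return -sum(x * log(x) for row in u for x in row if x > 0) / len(u)
```

For c = 2, a crisp (fully sparse) matrix gives V_pc = 1 and V_pe = 0, whereas a matrix whose memberships all equal 0.5 gives V_pc = 0.5 and V_pe = log 2, which matches the trend described above: weaker fuzziness means larger V_pc and smaller V_pe.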

V. CONCLUSION
In this work, we have proposed a robust self-sparse fuzzy clustering algorithm (RSSFCA) for image segmentation. The proposed RSSFCA addresses two drawbacks of current image segmentation algorithms based on fuzzy clustering. On the one hand, the RSSFCA incorporates a regularization term into its objective function to balance the sparsity and the fuzziness of memberships, and it thus achieves self-sparse fuzzy clustering. On the other hand, the RSSFCA employs the CCF-ADB to merge small regions effectively, which leads to excellent image segmentation results. Experiments on synthetic images and benchmark images demonstrate that the proposed RSSFCA outperforms state-of-the-art algorithms, with better segmentation results and better values of the performance indices. Furthermore, we analyzed the influence of the regularization parameter on clustering results, which shows that the proposed RSSFCA can provide good clustering results when the value of γ is less than 0.5.