Segmentation of Pulmonary Nodules Using a GMM Fuzzy C-Means Algorithm

Segmentation of pulmonary nodules in thoracic computed tomography (CT) plays an important role in computer-aided diagnosis (CAD) and clinical practice. However, it remains a challenging task due to intrinsic noise, low contrast, intensity inhomogeneity, and variable nodule sizes and shapes. Many variants and extensions of the fuzzy C-means (FCM) clustering algorithm have been developed to preserve image details while suppressing image noise. However, these variants overemphasize spatial information and neglect prior knowledge. To address this problem, a GMM fuzzy C-means (GMMFCM) algorithm is proposed in this paper for the segmentation of pulmonary nodules. A novel local similarity measure is defined using local spatial information and Gaussian mixture model (GMM) statistical information, and a neighboring term is added to the energy function of the traditional FCM algorithm. A superpixel-based random walker is proposed to segment the pulmonary parenchyma, which reduces the computational complexity and improves the segmentation performance. Experiments performed on the LIDC dataset and the GHGZMCPLA dataset demonstrate that the segmentation performance of the proposed GMMFCM algorithm is superior to that of state-of-the-art algorithms.


I. INTRODUCTION
Lung cancer is the leading cause of cancer-related death worldwide. The American Cancer Society estimated that, in the United States in 2017, 1,688,780 new cancer cases would be diagnosed and 600,920 cancer deaths would occur. The five-year survival rate of lung cancer patients is only 18%, because most patients are diagnosed at an advanced stage [1]. Computed tomography (CT) is one of the most common imaging modalities for examining and screening lung cancer. Several studies have shown that early diagnosis of lung cancer could help improve the survival rate and reduce the mortality rate by up to 20% [1]. Lung cancer potentially manifests itself as pulmonary nodules at an early stage. Although most pulmonary nodules with a size equal to or less than 5 mm are benign, distinguishing lung cancer from benign lesions is crucial. Therefore, an early diagnosis of pulmonary nodules is essential to determine whether treatment is necessary.
The associate editor coordinating the review of this manuscript and approving it for publication was Gina Tourassi.
The accurate segmentation of pulmonary nodules in CT images is an important task for the early diagnosis of lung cancer. Traditionally, pulmonary nodules are segmented manually by radiologists. However, manual segmentation is time-consuming and prone to intra- and inter-observer variability, which makes the results unreliable. In addition, the vast number of CT images to be analyzed places a heavy burden on radiologists. Hence, automatic segmentation of pulmonary nodules in thoracic CT images is an area of ongoing and extensive research. Clinicians facing this workload, which is common in clinical practice, would benefit from automated segmentation methods.
VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
Segmentation of pulmonary nodules remains a challenging task due to intra-nodule inhomogeneity, low contrast, blurred boundaries, variable sizes and irregular shapes. Over the last decades, considerable effort has been devoted to pulmonary nodule segmentation, including threshold-based methods [2], region growth-based methods [3], morphology-based methods [4], [5], shape-based methods [6], Hessian-based methods [7], [8], active contour-based methods [9]-[11], level set-based methods [12], [13], clustering-based methods [14], [15], local shape analysis methods [16] and machine learning-based methods [17], [18]. Threshold-based and region growth-based methods can achieve good segmentation performance for solitary pulmonary nodules, because the intensity distributions of the nodule and the background are distinguishable. For other types of pulmonary nodules, such as juxta-vascular nodules, juxta-pleural nodules and ground-glass opacity (GGO) nodules, segmentation often fails due to the strong overlap of the respective intensity distributions. Shape-based methods have been well studied for pulmonary nodule segmentation. They model a nodule as either a spherical or an ellipsoidal shape. However, due to the large morphological variability of pulmonary nodules, these geometrical assumptions are often not valid. Active contour and level set methods have been attracting increasing attention for pulmonary nodule segmentation. Although these deformable methods have obtained satisfactory segmentation performance, they are sensitive to noise and depend on the location of the initial contour. In addition, because they require a large number of iterations, these deformable methods have a high computational cost. Therefore, it is of great interest to develop a simple and general algorithm for pulmonary nodule segmentation.
The fuzzy C-means (FCM) algorithm has been widely used in image segmentation. The main drawback of the conventional FCM algorithm is its sensitivity to noise, which stems from the lack of spatial information. However, intensity inhomogeneity and noise are inevitable in CT images. Although variants of the FCM algorithm yield desirable results for image segmentation, some shortcomings remain. For example, some variants have limited segmentation accuracy, others remain sensitive to noise and outliers, and selecting the initial cluster centers is also a challenging problem.
These drawbacks significantly hinder the application of the FCM algorithm to pulmonary nodule segmentation. To address these problems, a GMM fuzzy C-means (GMMFCM) algorithm is proposed for pulmonary nodule segmentation in this paper. Distinct from the FLICM algorithm, the proposed algorithm incorporates GMM prior knowledge into the traditional FCM algorithm by defining a new local similarity measure.
The contributions of this paper are as follows: 1) A novel multiscale dot enhancement filter is proposed. The traditional multiscale dot enhancement filter is sensitive to image noise due to the calculation of second order derivatives of Hessian matrix. To address this problem, a novel multiscale dot enhancement filter is proposed by incorporating the Hessian matrix and shape index (SI) to alleviate the interference of the intensity inhomogeneity within pulmonary nodules, as well as avoid the influence of image noise and surrounding tissues.
2) The seeds of the pulmonary nodule and the background are automatically selected. Choosing proper seeds is very important for the likelihood estimation of the GMM. In this paper, the nodule seeds are chosen from the response of the multiscale dot filter, and the background seeds are chosen by combining the shape index and texture properties.
3) A novel local similarity measure is proposed using GMM prior knowledge. The traditional FLICM algorithm fails to discriminate between nodule pixels and non-nodule pixels in CT images, resulting in suboptimal segmentation results. This severely restricts the application of the FLICM algorithm. To overcome this limitation, a novel local similarity measure is proposed based on the GMM posterior probability, which is calculated from the nodule and background seeds.
4) The initial cluster centers are selected automatically. In previous FCM algorithms, the initial cluster centers were selected randomly. In this paper, the initial cluster centers of the nodules are chosen from the multiscale dot filter response, and the initial cluster centers of the background are chosen from the background seeds.
The remainder of this paper is organized as follows. Section II reviews previous studies related to our work. Section III presents a detailed description of the proposed GMMFCM algorithm for segmentation of pulmonary nodules. Section IV describes the results of pulmonary nodule segmentation, as well as a comparison of the proposed algorithm with other related segmentation algorithms on the LIDC dataset and the GHGZMCPLA dataset. Section V discusses the results. Finally, Section VI provides the conclusions and future work.

II. RELATED WORKS
Fuzzy C-means (FCM) clustering is one of the most popular clustering algorithms. FCM is a soft clustering algorithm in which each image pixel can belong to more than one cluster with different degrees of membership. Owing to its simplicity and efficiency, FCM has been successfully applied to several image processing problems, such as image clustering [19]-[21], image segmentation [22]-[26] and image classification [27]-[32]. The traditional FCM algorithm obtains the segmentation result by minimizing the sum of the distances between each pixel and its corresponding cluster centers. Although it works well on intensity-homogeneous and noise-free images, it fails to segment images corrupted by noise and other imaging artifacts, because it does not consider spatial information and relies on the squared Euclidean distance. However, medical images are often corrupted by noise and artifacts due to the physical mechanisms of the acquisition process and the movement of the patient's body. Many variants and extensions have therefore been developed in the literature, and many researchers have taken local spatial information into account. For example, Huang et al. [23] proposed the FCM_S algorithm for segmentation of brain MR images. The authors introduced a spatial neighborhood term into the original objective function of the FCM algorithm. The main shortcoming of FCM_S was its computational complexity and storage requirement, because the neighborhood labels of pixels were calculated in each iteration step. Its variants, the FCM_S1 and FCM_S2 algorithms, were developed to reduce the computational burden of the spatial neighborhood term. To further speed up convergence, Cai et al. [33] proposed the fast generalized FCM (FGFCM) algorithm for image segmentation. The authors combined spatial and gray-level information to construct a local similarity measure and then formed a non-linearly weighted sum image. The clustering process was performed on the summed image rather than the original image. Unfortunately, some important information of the original image may be lost, which degrades the segmentation performance.
Another critical issue is how to determine the crucial parameters, which control the tradeoff between robustness to noise and effectiveness in preserving image details. It is very difficult to select appropriate parameters. To address these problems, Krinidis et al. [34] proposed the FLICM algorithm for image segmentation. The authors introduced a fuzzy factor to replace the parameters used in the FCM_S algorithm and its variants. Although the FLICM algorithm achieved the desired segmentation results, it still has shortcomings in preserving image edges. Li et al. [35] proposed a modified FLICM algorithm for image segmentation, which introduced edge and local information to reduce edge degradation. Gong et al. [36] proposed a variant of the FLICM algorithm, which introduced a tradeoff weighted fuzzy factor and a kernel metric for image segmentation. Although the FLICM algorithm and its variants considered the influence of the neighboring pixels, they failed to make full use of the information of the center pixel in the local window. Thus, the borders and edges of some regions may be over-smoothed. To reduce the smoothing and further improve the segmentation accuracy, Ji et al. [37] proposed the RSCFCM algorithm for brain MR image segmentation. The authors introduced a spatial factor based on posterior probabilities and prior probabilities to alleviate the disturbance of noise and intensity inhomogeneity. The dissimilarity function was constructed by considering the prior probabilities. Recently, Zhang et al. [38] proposed the ADFLICM algorithm by introducing a fuzzy local similarity measure to replace the fixed parameter of the FCM_S algorithm. The main advantage of the fuzzy local similarity measure is that it varies adaptively with the gray-level and local spatial relationships between the center pixel and its neighbors in a local window. Zhao et al. [39] proposed an interval type-2 fuzzy C-means clustering algorithm for color image segmentation, in which a novel interval type-2 fuzzy clustering objective function is constructed by utilizing the intuitionistic fuzzy information extracted from images. Singh and Bala [40] proposed the local and nonlocal FCM, in which the distance function of the FCM is represented as the sum of the local and nonlocal distances, which are themselves weighted values of the Euclidean distance used in the FCM.

III. METHODOLOGY
In this paper, a GMM fuzzy C-means clustering algorithm is proposed for pulmonary nodule segmentation. The flowchart of the proposed method is shown in Fig. 1. First, the original thoracic CT images are smoothed by a non-local means filter and down-sampled by a Gaussian pyramid. Second, the pulmonary parenchyma is segmented by a superpixel-based random walker algorithm. To generate the nodule seeds, pulmonary nodules are enhanced by a new multiscale dot enhancement filter. To generate the background seeds, a scheme that combines shape index and texture features is proposed. Then, the nodule and background GMM models are built from the generated seeds. Finally, the pulmonary nodules are segmented by the GMM fuzzy C-means algorithm. The details of each step are described in the following sections.

A. PREPROCESSING
Due to the imaging mechanism of the acquisition process and the movement of the patient's body, thoracic CT images are inevitably contaminated by noise and artifacts. Therefore, preprocessing is necessary to reduce the noise and enhance the contrast.
First, a non-local means (NLM) filter with a 3 × 3 mask is adopted. The NLM filter is capable of reducing noise without sacrificing image details. Second, a Gaussian pyramid is employed to reduce the image resolution by half, which reduces the computational complexity and accelerates convergence. The pyramid is a sequence of smoothed images generated by a Gaussian filter. The basis of the Gaussian pyramid is formulated in equation (1).
where $l$ is a level of the Gaussian pyramid, and $i$ and $j$ are the spatial coordinates in the x- and y-directions at the $l$-th level, respectively. $g_{l-1}$ is the Gaussian-smoothed image at the $(l-1)$-th level. $w(m, n)$ denotes a 5 × 5 Gaussian filter, which is applied to the neighboring pixels of a pixel $(i, j)$. $g_0$ denotes the initial image, i.e., the input image at level $l = 0$. $g_l$ is generated by smoothing the image $g_{l-1}$ with the Gaussian filter $w(m, n)$. Fig. 2 shows a CT image with a nodule. Fig. 2(a) is an original image with a well-circumscribed nodule. Fig. 2(b) is the result of down-sampling by the Gaussian pyramid: the 512 × 512 image is down-sampled to 256 × 256. This image is passed directly to the pulmonary parenchyma segmentation step.
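The reduction step described above can be illustrated with a minimal NumPy sketch. It assumes the usual pyramid form in which $g_l(i, j)$ is the sum over $m, n \in [-2, 2]$ of $w(m, n)\, g_{l-1}(2i + m, 2j + n)$. The function names, the reflective border handling and the kernel parameter `a = 0.4` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def gaussian_kernel_5x5(a: float = 0.4) -> np.ndarray:
    """Separable 5x5 kernel w(m, n); a = 0.4 is an assumed, commonly used value."""
    w1d = np.array([0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2])
    return np.outer(w1d, w1d)

def pyramid_reduce(g_prev: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One pyramid level: smooth g_{l-1} with w and keep every second pixel."""
    rows, cols = g_prev.shape
    padded = np.pad(g_prev.astype(float), 2, mode="reflect")  # assumed border handling
    out = np.zeros(((rows + 1) // 2, (cols + 1) // 2))
    for m in range(-2, 3):
        for n in range(-2, 3):
            # Accumulate w(m, n) * g_{l-1}(2i + m, 2j + n) for all (i, j) at once.
            shifted = padded[2 + m : 2 + m + rows : 2, 2 + n : 2 + n + cols : 2]
            out += w[m + 2, n + 2] * shifted
    return out
```

Applied to a 512 × 512 slice, `pyramid_reduce` returns a 256 × 256 image, which matches the down-sampling described for Fig. 2(b).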

B. GMM FUZZY C-MEANS
The traditional FCM algorithm neglects spatial information and is therefore sensitive to noise. Although the FLICM algorithm introduced a fuzzy factor to suppress image noise and preserve image details, it is limited in discriminating boundary pixels. To address this problem, a GMM fuzzy C-means (GMMFCM) algorithm is proposed for pulmonary nodule segmentation. The proposed GMMFCM algorithm takes into account the spatial relationship between neighboring pixels and computes the posterior probability of each pixel belonging to each cluster. A new local similarity measure is defined based on the posterior probabilities of the GMM models. Then, the objective function of the GMMFCM algorithm is defined. The details of each step are described in the next sections.

1) DEFINITION OF LOCAL SIMILARITY MEASURE
The objective function of FCM algorithm is iteratively minimized to find a solution to the problem. The high membership degree is assigned to the pixel whose intensity value is close to the center of the corresponding cluster, and the low membership degree is assigned to the pixel whose intensity value is far from the center. Hence, the membership degree is sensitive to the presence of noise and intensity inhomogeneity.
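As a point of reference for the membership behavior described above, the following is a minimal sketch of the conventional FCM iteration on pixel intensities only, using the standard textbook update rules. It is not the proposed GMMFCM algorithm; the function name, the random initialization and the default parameter values are illustrative assumptions.

```python
import numpy as np

def fcm(x, c=2, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Conventional FCM on a 1-D array of intensities x (no spatial term)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.random((c, x.size))
    u /= u.sum(axis=0, keepdims=True)              # memberships of each pixel sum to 1
    v = np.zeros(c)
    for _ in range(max_iter):
        um = u ** m
        v_new = (um @ x) / um.sum(axis=1)          # centers: membership-weighted means
        d2 = (x[None, :] - v_new[:, None]) ** 2 + 1e-12   # squared Euclidean distance
        inv = d2 ** (-1.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)   # close to a center -> high membership
        converged = np.max(np.abs(v_new - v)) < eps
        v = v_new
        if converged:
            break
    return u, v
```

Because each membership depends only on the intensity distance to the centers, a single noisy pixel receives a membership dictated by its corrupted intensity, which is the sensitivity that the spatial terms of FLICM and of the proposed method are intended to reduce.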
The FLICM algorithm introduced a fuzzy factor to reduce image noise while preserving image details. The fuzzy factor characterizes the spatial relationships between a pixel and its neighbors. However, it is unable to handle boundary pixels. It is well known that some pulmonary nodules and adjacent structures share similar intensities, which results in fuzzy boundaries. To address this problem, a novel local similarity measure is introduced for pulmonary nodule segmentation by incorporating the local spatial information and the GMM statistical information. For two pixels $i$ and $j$, the spatial measure is defined in equation (2).
where $d(\cdot)$ is a spatial distance between two neighboring pixels $i$ and $j$. $P_{ki}$ and $P_{kj}$ denote the posterior probabilities of the pixels $i$ and $j$ obtained by fitting the Gaussian mixture model (GMM) of cluster $k$. $d_{ij}$ is the spatial Euclidean distance between the pixels $i$ and $j$. $c$ is the number of clusters. A novel local similarity measure is defined in equation (3).
where the $i$-th pixel is the center of the local window $N_i$, and the $r$-th pixel is a neighboring pixel that falls into $N_i$. The proposed local similarity function does not involve any tunable parameters. After the definition of the local similarity measure, the objective function of the GMMFCM algorithm is defined in the following section.

2) FORMULATION OF PROPOSED GMMFCM ALGORITHM
The objective function of the proposed GMMFCM algorithm is defined in equation (4).
where $u_{ki}$ is the degree of membership of the $i$-th pixel belonging to the $k$-th cluster. $v_k$ is the prototype of the $k$-th cluster. $m$ is a fuzzy index, which determines the level of cluster fuzziness of the membership grades. $N_R$ is the cardinality of the neighborhood system $N_i$. $\|\cdot\|$ denotes the norm operator. $\alpha$ is the tradeoff parameter, which controls the effect of the factor term. The first term is the objective function used in the traditional FCM algorithm, which assigns a high membership to image pixels whose intensity values are close to the center of a particular cluster and a low membership to image pixels whose intensity values are far from it. The second term is a local similarity regularization term, whose weight factor is automatically determined by incorporating the spatial relationship and the prior knowledge of two neighboring pixels. Therefore, the local spatial relationship changes adaptively and more local spatial information is considered. According to the Lagrange multiplier method, the Lagrange function of (4) is defined in equation (5).
Taking the partial derivatives of the Lagrange function (5) with respect to the variables $u_{ki}$ and $v_k$ and setting them to zero, the membership $u_{ki}$ and the center $v_k$ are updated according to equations (6) and (7).
After the cluster centers are updated, the stopping criterion is used to check whether they have converged. If the cluster centers have not converged, the algorithm is applied again to obtain new fuzzy partitions and cluster centers. This process is repeated until the cluster centers converge or the maximum number of iterations is reached. The GMMFCM algorithm is summarized in Algorithm 1.

3) BUILD OF NODULE AND BACKGROUND GMM MODELS
To calculate the posterior probabilities of local similarity function, the GMM models of the nodule and background are built by using the generated seeds in this section. To reduce the computational complexity, pulmonary parenchyma is segmented by the proposed superpixel-based random walker algorithm. Then, the nodule and background seeds are automatically generated by the proposed multiscale dot filter and a scheme of background seed generation.
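How posterior probabilities can be obtained from seed intensities is sketched below under a strong simplification: each class is modeled by a single Gaussian (a one-component stand-in for a GMM) and the class priors are supplied by hand. The function names, the equal default prior and the variance floor are assumptions for illustration only.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def fit_single_gaussian(seeds):
    """Estimate (mean, variance) from seed intensities; one-component stand-in for a GMM."""
    seeds = np.asarray(seeds, dtype=float)
    return seeds.mean(), seeds.var() + 1e-6        # small floor avoids zero variance

def posteriors(pixels, nodule_seeds, background_seeds, prior_nodule=0.5):
    """Return P(nodule | x) and P(background | x) for each intensity via Bayes' rule."""
    pixels = np.asarray(pixels, dtype=float)
    mu_n, var_n = fit_single_gaussian(nodule_seeds)
    mu_b, var_b = fit_single_gaussian(background_seeds)
    like_n = prior_nodule * gaussian_pdf(pixels, mu_n, var_n)
    like_b = (1.0 - prior_nodule) * gaussian_pdf(pixels, mu_b, var_b)
    total = like_n + like_b + 1e-300               # guard against division by zero
    return like_n / total, like_b / total
```

A multi-component mixture fitted by expectation-maximization would replace `fit_single_gaussian` in practice; the Bayes step that turns class likelihoods into posteriors stays the same.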
In a normal lung, the air-filled pulmonary parenchyma and the surrounding tissues have a large density difference in CT images, so simple thresholding methods are often used to segment the pulmonary parenchyma. However, segmentation of pulmonary parenchyma with high-density pulmonary nodules is a nontrivial problem. This is because lungs with pulmonary nodules may have density values similar to those of other anatomical tissues surrounding the lung regions. Only a few studies have been published that handle segmentation of lungs with pulmonary nodules [41], [42]. None of the existing segmentation methods directly segments lungs with pulmonary nodules at arbitrary locations.
Algorithm 1 GMMFCM Algorithm
Input: The number of clusters $c$, the stopping condition $\varepsilon$, the fuzzy index $m$.
Output: Clusters $v_1, v_2, \cdots, v_c$ and the new $U$.
1: Initialize the fuzzy partition matrix $U$ and the center vector $V$;
2: Set the loop counter $b = 0$;
3: Calculate the new local similarity measure $w^{GMM}_{ir}$ by equation (3);
4: Update the cluster center $V$ by using equation (6);
5: Update the fuzzy partition matrix $U$ by using equation (7);
6: Repeat steps 3, 4 and 5 until the stopping criterion is satisfied. If it is satisfied, the iteration stops; otherwise, let $b = b + 1$ and go back to step 3;
7: When the algorithm converges, a new fuzzy partition matrix $U$ is obtained. Then, each pixel $i$ is assigned to the cluster $C$ with the largest membership value.
Recently, the random walker (RW) algorithm has attracted considerable attention in the field of computer vision. In medical image segmentation, several studies have demonstrated its benefits for segmenting different organs and tissues [43]-[46]. The solution of the RW model is obtained by solving a linear system according to the norm used to define the energy function, so the total time complexity depends on the number of pixels. Therefore, the computational complexity and memory cost can be enormous for a large image. To address this problem, a superpixel-based random walker (SRW) model is proposed for pulmonary parenchyma segmentation, which consists of two steps. The first step is to over-segment the CT image into small regions via the SLIC superpixel algorithm, and the second step is to build the RW model for pulmonary parenchyma segmentation.
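To make the linear-system view of the RW model concrete, the sketch below solves the standard two-label random walker problem on a small weighted graph with NumPy. It assumes the usual combinatorial Dirichlet formulation, in which the graph Laplacian is partitioned into seeded and unlabeled nodes; the function name and the label encoding are hypothetical.

```python
import numpy as np

def random_walker_probabilities(weights, seed_labels):
    """
    weights: symmetric (n, n) array of non-negative edge weights (0 = no edge).
    seed_labels: length-n array, 1 = foreground seed, 0 = background seed, -1 = unlabeled.
    Returns the probability that a walker from each node first reaches a foreground seed.
    """
    w = np.asarray(weights, dtype=float)
    labels = np.asarray(seed_labels)
    lap = np.diag(w.sum(axis=1)) - w               # graph Laplacian L = D - W
    seeded = labels >= 0
    free = ~seeded
    l_u = lap[np.ix_(free, free)]                  # block for unlabeled nodes
    b_t = lap[np.ix_(free, seeded)]                # coupling between unlabeled and seeded nodes
    x_m = (labels[seeded] == 1).astype(float)      # boundary values at the seeds
    x_u = np.linalg.solve(l_u, -b_t @ x_m)         # one linear system per label
    prob = np.empty(len(labels))
    prob[seeded] = x_m
    prob[free] = x_u
    return prob
```

The system has one unknown per unlabeled node, so using superpixels instead of pixels as nodes shrinks it directly, which is the motivation for the SRW model.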
The Simple Linear Iterative Clustering (SLIC) superpixel algorithm [47] is employed to generate superpixels due to its simplicity and efficiency. Each input image is over-segmented into a set of superpixels $\{R_i\}_{i=1}^{M}$ by the SLIC algorithm, where $M$ is the number of superpixels. Based on the superpixels, an undirected weighted graph $G = (V, E)$ is constructed, where each superpixel is represented as a node and each edge connects two neighboring superpixels. Because the number of image primitives is greatly reduced, the superpixel-level RW model significantly reduces the computational complexity and memory requirement compared with the pixel-level RW model. In addition, the superpixels provide compact and structural information, which reduces the risk of assigning wrong labels to the corresponding pixels.
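The construction of the edge set $E$ from a superpixel label map can be sketched as follows. The label map is assumed to come from any SLIC implementation (for example `skimage.segmentation.slic`), and 4-connectivity between pixels is an assumption; the function name is hypothetical.

```python
import numpy as np

def superpixel_edges(label_map):
    """Return undirected edges (a, b), a < b, between 4-connected neighboring superpixels."""
    labels = np.asarray(label_map)
    edges = set()
    # Compare each pixel with its right neighbor, then with its lower neighbor.
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        differ = a != b                              # pixel pairs that straddle a boundary
        pairs = np.stack([a[differ], b[differ]], axis=1)
        for p, q in pairs:
            edges.add((int(min(p, q)), int(max(p, q))))
    return edges
```

Each returned pair corresponds to one edge of $G$; the edge weights would then be assigned from the superpixel feature distributions.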
For each pixel within a superpixel, intensity, texture, gradient features and spatial distance are extracted. Each superpixel is modeled by a multivariate normal distribution $N(\mu_i, \Sigma_i)$ in the eight-dimensional feature space. Given two neighboring superpixels $R_i$ and $R_j$, we measure the similarity of their probability distributions based on the Kullback-Leibler (KL) divergence. The more similar two probability distributions are, the smaller the KL divergence is. The new weighted matrix is defined in equation (8).
where $f_i$ and $f_j$ are the multivariate Gaussians with distributions $N(\mu_i, \Sigma_i)$ and $N(\mu_j, \Sigma_j)$. $KL(f_i, f_j)$ denotes the Kullback-Leibler (KL) divergence between two $d$-dimensional Gaussian distributions, which is calculated in equation (9).
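The KL divergence between two $d$-dimensional Gaussians has a well-known closed form, which the sketch below evaluates with NumPy. The function name is hypothetical, and the mapping from the divergence to the edge weight of equation (8) is not reproduced here.

```python
import numpy as np

def kl_gaussians(mu_i, cov_i, mu_j, cov_j):
    """KL(f_i || f_j) for two d-dimensional Gaussians N(mu_i, cov_i) and N(mu_j, cov_j)."""
    mu_i, mu_j = np.asarray(mu_i, dtype=float), np.asarray(mu_j, dtype=float)
    cov_i, cov_j = np.asarray(cov_i, dtype=float), np.asarray(cov_j, dtype=float)
    d = mu_i.size
    inv_j = np.linalg.inv(cov_j)
    diff = mu_j - mu_i
    _, logdet_i = np.linalg.slogdet(cov_i)           # log-determinants for numerical stability
    _, logdet_j = np.linalg.slogdet(cov_j)
    # 0.5 * [ tr(S_j^-1 S_i) + (mu_j - mu_i)^T S_j^-1 (mu_j - mu_i) - d + ln(|S_j| / |S_i|) ]
    return 0.5 * (np.trace(inv_j @ cov_i) + diff @ inv_j @ diff - d + logdet_j - logdet_i)
```

The divergence is zero for identical distributions and grows as the means or covariances move apart. It is not symmetric in its arguments, so a symmetrized variant may be preferable when an undirected edge weight is required.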
By minimizing the energy function of the SRW model, the probability of a superpixel $R_i$ is close to that of its neighboring superpixel $R_j$ when a high weight $w_{KL}(R_i, R_j)$ is given. That is, two neighboring superpixels with similar probability distributions obtain similar probabilities. Fig. 3 shows the segmentation results for a juxta-pleural nodule. Fig. 3(b) and (c) show the results of pulmonary parenchyma segmentation using the conventional RW algorithm (red contour) and the proposed SRW algorithm (green contour), respectively. The conventional RW algorithm is more prone to producing incorrect results, as shown in Fig. 3(b). This is because juxta-pleural nodules may damage the pleural surfaces, disrupting the integrity of the lung boundary. Some local segmentation errors of the pulmonary parenchyma with pulmonary nodules are indicated by red arrows. As shown in Fig. 3(c), the SRW algorithm achieves better segmentation results than the conventional RW algorithm for pulmonary parenchyma segmentation. After pulmonary parenchyma segmentation, the nodule and background seeds are generated to build the GMM models. A detailed description is given in the next section.
To generate the nodule seeds, a new multiscale enhancement dot filter (NMEDF) is proposed in this section. The multiscale enhancement dot filter (MEDF) was first proposed by Li et al. [48] for pulmonary nodule enhancement. The MEDF can enhance dot-like pixels while suppressing other surrounding tissues, such as line-like vessels and planar airway walls. The traditional MEDF is highly sensitive to image noise, partial volume effects and patient motion, because the calculation of the Hessian matrix is based on second-order partial derivatives. The NMEDF incorporates the Hessian matrix and the shape index [49], which makes it more robust against image noise and intensity inhomogeneity within pulmonary nodules and reduces the influence of surrounding tissues.
Let $\lambda_1$ and $\lambda_2$ denote the two eigenvalues of the Hessian matrix in two-dimensional (2D) image space, ordered by magnitude such that $|\lambda_2| \ge |\lambda_1|$. The shape index (SI) is calculated in equation (10).
Then, the likelihood function of the novel multiscale dot enhancement filter is defined in equation (11).
where $g(SI(\sigma_i))$ is an indicator function that equals 1 when $SI(\sigma_i)$ is greater than or equal to 0.75, so that only dot-like shapes are retained, and equals 0 when $SI(\sigma_i)$ is less than 0.75. If the diameters of the pulmonary nodules range from $d_0$ to $d_1$, then each scale factor is calculated in equation (12).
where $r = (d_1/d_0)^{1/(N-1)}$ and $N$ is the number of scale factors. The final output of the enhancement filter for a pixel $x$ is calculated as the maximum response over all scales $\sigma$, which is represented in equation (13).
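The scale schedule and the multiscale combination can be sketched as follows. The ratio $r$ follows the definition above, while the choice of the smallest scale as $d_0/4$ is an assumption borrowed from the convention of the original dot enhancement filter and may differ from equation (12). The function names and the example value $N = 5$ are illustrative.

```python
import numpy as np

def scale_factors(d0, d1, n_scales):
    """Geometric sequence of n_scales smoothing scales for nodule diameters in [d0, d1]."""
    r = (d1 / d0) ** (1.0 / (n_scales - 1))
    sigma_1 = d0 / 4.0                     # assumed smallest scale, not taken from the paper
    return sigma_1 * r ** np.arange(n_scales)

def multiscale_max(responses):
    """Combine per-scale filter responses of shape (n_scales, H, W) by a pixel-wise maximum."""
    return np.max(np.asarray(responses), axis=0)
```

With $d_0 = 4$ mm, $d_1 = 36$ mm and $N = 5$, consecutive scales differ by the factor $r = 9^{1/4} \approx 1.73$ under this assumption.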
The nodule seeds are generated by thresholding the enhanced pulmonary nodule image. To generate the background seeds, shape index and texture features are used. The pixels whose shape index value is less than a threshold $T_1$ are extracted as candidate background seeds. However, some pixels may be mistakenly identified by the shape index alone. To address this problem, the texture feature is also considered. The Gabor transform is employed to generate the texture map, and all pixels whose texture value is less than a threshold $T_2$ are extracted as candidate background seeds. The final background seeds are obtained by combining the shape index and texture properties. After the seeds are obtained, the nodule and background GMMs are built to calculate the posterior probabilities used in the local similarity function.
Fig. 4 shows the generation results of the nodule and background seeds. Fig. 4(a) is a segmented pulmonary parenchyma with a pulmonary nodule. Fig. 4(b) and (c) are the nodule enhancement results of the traditional MEDF and the proposed NMEDF, respectively. As shown in Fig. 4(b) and (c), the proposed NMEDF obtains a better nodule enhancement result and suppresses more background regions. The red arrows indicate the enhanced nodules. The good result benefits from the incorporation of the Hessian matrix and the shape index. After pulmonary nodule enhancement, a suspicious pulmonary nodule within the region of interest is obtained by a threshold-based technique, which is shown in Fig. 4(d). Fig. 4(e) shows the nodule seeds obtained by thresholding the enhanced image, which are marked with red points. Fig. 4(f) and (g) show the background seeds obtained by thresholding the shape index map and the texture map, respectively. As shown in Fig. 4(f), a small number of pixels within the pulmonary nodule region are regarded as background seeds (red arrows) when only the shape index is used. As shown in Fig. 4(h), the wrong background seeds are removed by combining the shape index and texture properties.
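The seed generation rules can be summarized in a short sketch. It assumes that combining the two background criteria means taking their intersection, which is consistent with the removal of wrong seeds in Fig. 4(h) but is not stated explicitly. The function names and the nodule threshold argument are hypothetical; the defaults 0.9 and 0.7 are the values of $T_1$ and $T_2$ reported in the parameter settings.

```python
import numpy as np

def nodule_seeds(enhanced, threshold):
    """Pixels whose dot-filter response reaches the threshold are taken as nodule seeds."""
    return np.asarray(enhanced) >= threshold

def background_seeds(shape_index_map, texture_map, t1=0.9, t2=0.7):
    """Pixels below T1 in the shape index map AND below T2 in the texture map (assumed rule)."""
    return (np.asarray(shape_index_map) < t1) & (np.asarray(texture_map) < t2)
```

A pixel inside a nodule that has a low shape index but a high texture response is rejected by the second condition, which mirrors the correction from Fig. 4(f) to Fig. 4(h).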

IV. EXPERIMENTAL SETUP
In this section, experiments are performed on two datasets to validate the performance of the proposed method. The segmentation performance of the proposed GMMFCM algorithm is evaluated by qualitative and quantitative analysis. We compare the accuracy and effectiveness of the proposed algorithm with those of the FCM, FCM_S, FCM_S1, FCM_S2, FLICM, FCM-type and non-FCM algorithms. All experiments are run on the MATLAB platform on a PC with an Intel E3-1225 CPU (3.31 GHz) and 4 GB RAM.

A. DATASETS
The LIDC dataset is the largest publicly available collection of computed tomography (CT) images for validating the segmentation or classification of pulmonary nodules. It consists of 1018 cases from seven academic centers and eight medical imaging companies worldwide. Each case contains two parts: the images from a clinical thoracic CT scan and the corresponding XML file that records the results of a two-phase image annotation process performed by four experienced chest radiologists. The diameters of the pulmonary nodules range from 2.03 mm to 38.12 mm, and the slice intervals range from 0.45 mm to 5.0 mm. All pulmonary nodules are segmented by up to four radiologists. The outlines of the nodules ≥ 3 mm are individually marked by the radiologists, whereas the nodules < 3 mm are represented by a single coordinate.
In this paper, the nodules ≥ 3 mm (a total of about 893 pulmonary nodules) are selected to conduct the experiments. The nodules < 3 mm are not considered. Due to the variability of the segmentation results among the four radiologists, a 50% consensus criterion [4] is used to produce the ground-truth outline. From the 893 nodules, 200 nodules are randomly selected for performance evaluation in this paper.
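A 50% consensus ground truth can be sketched as a pixel-wise vote over the reader masks. The sketch assumes that a pixel is kept when at least half of the available readers marked it; the function name and the inclusive threshold are assumptions.

```python
import numpy as np

def consensus_mask(radiologist_masks, level=0.5):
    """Keep a pixel in the ground truth if at least `level` of the readers marked it."""
    masks = np.asarray(radiologist_masks, dtype=bool)   # shape (n_readers, H, W)
    return masks.mean(axis=0) >= level
```

Because the vote is normalized by the number of masks supplied, the same function applies to nodules outlined by fewer than four radiologists.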

B. EVALUATION METRICS
To quantitatively evaluate segmentation performance of the proposed GMMFCM algorithm, seven evaluation criteria are used in this paper, including Accuracy, Sensitivity, Specificity, False positive ratio (FPR) and False negative ratio (FNR), Overlap score and Dice similarity coefficient (DSC).
Accuracy is the proportion of correctly identified pixels in the image segmented by the algorithms. It is calculated in equation (14).
Sensitivity is the proportion of correctly identified as nodule pixels and specificity is the proportion of correctly identified as background pixels, which are calculated in VOLUME 8, 2020 equation (15) and (16).
FPR is the fraction of falsely identified as nodule pixels and false negative ratio (FNR) is the fraction of falsely identified as background pixels, which are defined in equation (17) and (18).
Overlap score is a similarity measure, which reflects how the segmentation result of the algorithms matches the ground truth. It is calculated in equation (19).
DSC is an overlap measure between the segmented pulmonary nodule and ground truth. It is calculated in equation (20).
where true positive (TP) is the number of pixels correctly identified as nodule, false positive (FP) is the number of pixels wrongly identified as nodule, true negative (TN) is the number of pixels correctly identified as background, and false negative (FN) is the number of pixels wrongly identified as background. The values of the seven evaluation criteria range from 0 to 1. The larger the accuracy, sensitivity, specificity, overlap score and DSC are, the higher the similarity between the segmented nodule and the ground truth. The smaller the FPR and FNR are, the better the segmentation performance.
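As an illustration of how the seven criteria can be computed from the TP, FP, TN and FN counts, the following sketch uses the commonly used confusion-matrix definitions. The exact forms of equations (14)-(20) are not reproduced in this section, so the formulas below are assumptions: in particular, FPR = FP/(FP+TN), FNR = FN/(FN+TP), and the overlap score is taken to be the Jaccard index TP/(TP+FP+FN). The function name `segmentation_metrics` is hypothetical.

```python
import numpy as np

def _ratio(numerator, denominator):
    # Return NaN instead of raising when a class is absent from the image.
    return numerator / denominator if denominator else float("nan")

def segmentation_metrics(pred, truth):
    """Pixel-wise metrics between a predicted mask and a ground-truth mask."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = int(np.sum(pred & truth))    # nodule pixels found
    fp = int(np.sum(pred & ~truth))   # background labelled as nodule
    tn = int(np.sum(~pred & ~truth))  # background kept as background
    fn = int(np.sum(~pred & truth))   # nodule pixels missed

    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "fpr": _ratio(fp, fp + tn),           # assumed definition
        "fnr": _ratio(fn, fn + tp),           # assumed definition
        "overlap": _ratio(tp, tp + fp + fn),  # assumed to be the Jaccard index
        "dsc": _ratio(2 * tp, 2 * tp + fp + fn),
    }

# Toy example: 8 pixels, 2 TP, 1 FP, 1 FN, 4 TN.
scores = segmentation_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                              [1, 1, 0, 1, 0, 0, 0, 0])
```

Because the background usually dominates a CT slice, accuracy and specificity computed this way tend to stay close to 1, which is why the overlap score and DSC are more informative about boundary agreement.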

C. PARAMETER SETTING
In this paper, the parameters of the proposed GMMFCM method were optimized in all experiments. In the pulmonary nodule enhancement stage, the parameter σ should be large enough to cover all nodule sizes, so it is varied over the range from 2.03 mm to 68.43 mm in steps of 0.5 for all the experiments. Hence, both small and large pulmonary nodules can be successfully enhanced. d_0 and d_1 are set to 4 and 36 mm in the implementation, respectively. In the pulmonary nodule segmentation stage, there are four parameters, including T_1, T_2 and α. In the generation of nodule and background seeds, the thresholds T_1 and T_2 control the numbers of nodule and background seeds, and they are set to 0.9 and 0.7, respectively, to identify the image pixels that belong to the background. The parameter α controls the effect of the factor terms. If α is set to 0, the proposed GMMFCM algorithm degenerates to the conventional FCM algorithm.
The segmentation performance of the proposed GMMFCM algorithm depends on the choice of parameters. To examine the dependence of the segmentation accuracy on the regularization parameter α, a trial-and-error process is used. First, α is set to the initial value α_init = 0. Then, a range of values is tested to verify the performance of the proposed algorithm. Fig. 5 shows the segmentation results of the proposed algorithm for varying α. As shown in Fig. 5(b), (c) and (d), the segmentation results degrade slightly as α increases. Within a proper range, the proposed algorithm remains relatively stable with respect to the regularization parameter, as shown in Fig. 5(e), (f) and (g). After several iterations, the desired segmentation results are obtained when α > 0.6.
Based on the above analysis, the parameter is set to α = 0.8 in all experiments to obtain the best segmentation results. All experiments on the two datasets are performed with these parameter settings.

D. QUALITATIVE RESULTS AND COMPARISONS
In this section, some qualitative comparisons are conducted on two datasets to evaluate the segmentation accuracy of the proposed algorithm.

1) QUALITATIVE EVALUATION ON THE LIDC DATASET
To qualitatively evaluate the proposed GMMFCM algorithm on the LIDC dataset, we apply it to CT slices with various types of pulmonary nodules and compare the segmentation results with the ground truths. Four CT slice examples with different types of pulmonary nodules are taken from the LIDC dataset for a comprehensive validation of the proposed algorithm. The red and green contours indicate the segmentation results of the proposed algorithm and the ground truth, respectively.
The obtained segmentation results are shown in Fig. 6. The CT images after preprocessing are shown in the leftmost column of Fig. 6, where the red rectangles indicate the pulmonary nodule regions. The second column of Fig. 6 shows close-ups of the segmentation results. From the second column of Fig. 6, we can see clearly that the results of the proposed algorithm are very close to the pulmonary nodule boundaries. As shown in the rightmost column of Fig. 6, the proposed GMMFCM algorithm produces segmentation results that are close to the ground truths. Although some pulmonary nodules are extremely subtle, with fuzzy boundaries or homogeneous intensity, the proposed GMMFCM algorithm can successfully segment these nodules. The experimental results demonstrate that the proposed GMMFCM algorithm achieves the desired segmentation results for different types of pulmonary nodules from the LIDC dataset.
To qualitatively compare the proposed GMMFCM algorithm with existing methods, three comparison experiments are conducted on the LIDC dataset. The comparison experiments use five different algorithms (i.e., FCM, FCM_S, FCM_S1, FCM_S2, FLICM). Each algorithm uses its best parameters, and the experiments are performed on three different types of pulmonary nodules: a juxta-vascularized nodule, a juxta-pleural nodule and a GGO nodule.
Fig. 7 illustrates the segmentation results derived from the FCM, FCM_S, FCM_S1, FCM_S2, FLICM and GMMFCM algorithms for a juxta-vascularized pulmonary nodule. Fig. 7(a) is a lung CT slice with a juxta-vascularized pulmonary nodule. Fig. 7(b)-(f) are the segmentation contours obtained by using the FCM, FCM_S, FCM_S1, FCM_S2 and FLICM algorithms. As shown in Fig. 7(b), the FCM algorithm produces an inferior segmentation result for the juxta-vascularized nodule: many pixels belonging to the vessel structure are misclassified as part of the pulmonary nodule. As shown in Fig. 7(c)-(e), the FCM_S, FCM_S1 and FCM_S2 algorithms also produce serious over-segmentation results for the juxta-vascularized nodule. Although they improve the traditional FCM algorithm to some extent by considering the spatial neighboring information, the negative effects of the similar intensity profiles of the pulmonary nodule and its adjacent vessel structure cannot be completely resolved. The FLICM algorithm obtains an improved segmentation result by introducing a fuzzy factor, which is shown in Fig. 7(f). Most of the over-segmented pixels are removed by the FLICM algorithm, but some pixels belonging to the vessel structure still remain, as indicated by the yellow arrows. This is because the FLICM algorithm has some shortcomings in identifying the pixels on weak boundaries. Fig. 7(g) shows the segmentation contour of the proposed algorithm. As observed from Fig. 7(g), the proposed algorithm corrects the pixels that are misclassified as nodule pixels and produces the desired segmentation result. The proposed algorithm incorporates the spatial and statistical information to reduce boundary degradation.
Fig. 8 illustrates the segmentation results derived from the FCM, FCM_S, FCM_S1, FCM_S2, FLICM and GMMFCM algorithms for the juxta-pleural pulmonary nodule. Fig. 8(b)-(f) are the segmentation contours obtained by using the FCM, FCM_S, FCM_S1, FCM_S2 and FLICM algorithms. As shown in Fig. 8(b)-(e), the FCM, FCM_S, FCM_S1 and FCM_S2 algorithms also produce over-segmentation results for the juxta-pleural pulmonary nodule. From Fig. 8(f), we can clearly observe that the FLICM algorithm produces a slightly over-segmented result for the juxta-pleural nodule, as indicated by the yellow arrows. The negative effects of the similar intensity profiles of the pulmonary nodule and its adjacent pleural structure also cannot be completely resolved, because the FLICM algorithm fails to handle the pixels on weak boundaries. The proposed algorithm produces the desired segmentation result for the juxta-pleural pulmonary nodule, which is shown in Fig. 8(g).
The segmentation results obtained by the FCM, FCM_S, FCM_S1, FCM_S2, FLICM and GMMFCM algorithms for the GGO nodule are illustrated in Fig. 9. A lung CT slice around a GGO pulmonary nodule is shown in Fig. 9(a). As shown in Fig. 9(b)-(e), the FCM, FCM_S, FCM_S1 and FCM_S2 algorithms also produce over-segmentation results for the GGO nodule. The segmentation result of the FLICM algorithm for the GGO pulmonary nodule is shown in Fig. 9(f). From Fig. 9(f), we can observe that the FLICM algorithm produces a slightly over-segmented result for the GGO nodule, as indicated by the yellow arrows. The negative effects of intensity inhomogeneity within the pulmonary nodule also cannot be completely resolved by the FLICM algorithm. The proposed algorithm yields a satisfactory segmentation result for the GGO pulmonary nodule, which is shown in Fig. 9(g).

2) QUALITATIVE EVALUATION ON THE GHGZMCPLA DATASET
The segmentation performance of the proposed GMMFCM algorithm is also evaluated on the GHGZMCPLA dataset. In the first comparison experiment, the proposed GMMFCM algorithm is used to segment CT slices around various types of pulmonary nodules, and the segmentation results are compared with the ground truths. Fig. 10 shows the segmentation results of some examples. It is clearly seen that these examples are corrupted by intensity inhomogeneity and fuzzy boundaries. For each pulmonary nodule, four images are shown in Fig. 10. They are, from left to right, the preprocessed image, the segmentation result of the proposed algorithm, the ground truth, and the comparison between the segmented nodule and the ground truth. As shown in the last column of Fig. 10, the proposed algorithm produces segmentation results that are close to the ground truths. The experimental results also demonstrate that the proposed algorithm obtains satisfactory segmentation results for different types of pulmonary nodules from the GHGZMCPLA dataset. Another comparison experiment is conducted on the GHGZMCPLA dataset to compare the proposed GMMFCM algorithm with the other algorithms. Fig. 11 shows the comparison results with the FCM, FCM_S, FCM_S1, FCM_S2 and FLICM algorithms. As can be seen from the last column of Fig. 11, the pixels belonging to nodule structures can be accurately identified. The comparison among the algorithms is similar to that observed in the experiments on the LIDC dataset. The qualitative results of the comparison experiments on the LIDC dataset and the GHGZMCPLA dataset demonstrate that the proposed algorithm can achieve satisfactory results for various types of pulmonary nodules.
The good segmentation results can be attributed to the fact that the proposed GMMFCM algorithm incorporates the spatial and statistical information to reduce the boundary degradation. The proposed GMMFCM algorithm introduces a new local similarity measure to define the weighting factor, which is influenced not only by the spatial information of neighboring pixels but also by the posterior probability of belonging to the clusters, resulting in improved segmentation results.

E. QUANTITATIVE RESULTS AND COMPARISONS
In this section, some comparison experiments are conducted on two datasets to quantitatively evaluate the segmentation accuracy and effectiveness of the proposed algorithm.

1) QUANTITATIVE EVALUATION ON THE LIDC DATASET
In the first experiment, to quantitatively evaluate the proposed GMMFCM algorithm on the LIDC dataset, we run the proposed GMMFCM algorithm on 100 CT slices around various types of pulmonary nodules and compare the segmentation results with the ground truth. Owing to the limited space of the paper, the segmentation results of only 23 cases are summarized in Table 1. Because these cases are randomly selected, the results are similar for all the considered cases. The first and second columns show the IDs and the corresponding names. For each case, the Accuracy, Sensitivity, Specificity, Overlap score and Dice similarity coefficient (DSC) are given in columns three to seven. As observed from Table 1, the average accuracy of the proposed SRW_GMMFCM algorithm is 0.9997 ± 0.0002. The proposed algorithm also achieves an average sensitivity of 0.9257 ± 0.0545 and an average specificity of 0.9998 ± 0.0002. The results indicate that the proposed algorithm produces few FPs with respect to the ground truth. Fig. 12 shows the validation results of the proposed GMMFCM algorithm on the 23 cases. Fig. 12(a) is the sensitivity curve, which shows that the proposed algorithm achieves a high sensitivity. Fig. 12(b) and (c) are the overlap and DSC curves, respectively. From Fig. 12(b) and (c), we can observe that the proposed algorithm maintains a high overlap and a high DSC, which indicates a high similarity between the segmented nodule and the ground truth.
TABLE 1. Quantitative segmentation results using the proposed GMMFCM algorithm in terms of Accuracy, Sensitivity, Specificity, Overlap, DSC.
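Summary values of the form "mean ± standard deviation", such as those quoted above, can be obtained from per-case scores as in the following minimal sketch. The text does not state whether the sample or the population standard deviation is used; the sample form (n - 1 in the denominator) is assumed here, and the helper names and the three input values are purely illustrative, not entries of Table 1.

```python
import statistics

def summarize(values):
    """Return (mean, sample standard deviation) of per-case scores."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # assumption: sample std, n - 1 denominator
    return mean, std

def format_mean_std(values, digits=4):
    """Format per-case scores in the 'mean ± std' style used in the text."""
    mean, std = summarize(values)
    return f"{mean:.{digits}f} ± {std:.{digits}f}"

# Illustrative per-case sensitivities (made-up numbers, not from Table 1).
example = [0.90, 0.92, 0.94]
summary = format_mean_std(example)
```

Replacing `statistics.stdev` with `statistics.pstdev` gives the population form if that is the convention intended.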
Another comparison experiment is conducted using the FCM, FCM_S, FCM_S1, FCM_S2, FLICM, FCM-type [39] and Non-FCM [40] algorithms to quantitatively compare them with the proposed GMMFCM algorithm on the LIDC dataset. Table 2 summarizes the average values of accuracy, sensitivity, specificity, overlap, DSC, FPR and FNR for each of the compared algorithms. As shown in Table 2, the performance of the FCM algorithm is slightly lower than that of the FCM_S algorithm for pulmonary nodule segmentation. This is because the FCM algorithm is sensitive to image noise and intensity inhomogeneity, while the FCM_S algorithm considers the spatial neighboring information. As shown in the fifth row of Table 2, the FCM_S1 or FCM_S2 algorithm is slightly better than the FCM_S algorithm but worse than the FLICM algorithm, because the FLICM algorithm incorporates both the local spatial information and the gray-level relationship. The mean accuracy of the FLICM algorithm is 0.9991 and its mean overlap is 0.8204, which indicates the improvement achieved by the FLICM algorithm. The mean accuracy of the proposed algorithm is 0.9995 and its mean overlap is 0.8581, both of which are higher than those of the FLICM algorithm for different types of pulmonary nodules. The higher overlap and DSC of the proposed algorithm indicate a higher similarity between the segmented nodule and the ground truth. The sensitivity and specificity of the proposed algorithm are 0.9756 and 0.9999, respectively, which are also higher than those of the FLICM algorithm. The proposed algorithm obtains a lower FPR and a lower FNR, which also indicates its high segmentation performance. Regardless of the type of pulmonary nodule, the proposed algorithm achieves a better segmentation performance than the other algorithms, as it uses the prior knowledge.
The quantitative analysis of the comparison results demonstrates that the prior knowledge can improve the segmentation performance when segmenting various types of pulmonary nodules. Accordingly, as confirmed by the quantitative results, we can conclude that incorporating both the GMM posterior probability and the spatial information on the nodule-enhanced images leads to a better segmentation performance than incorporating the spatial information alone. Therefore, the proposed GMMFCM algorithm can successfully segment different types of pulmonary nodules and significantly improve the performance of pulmonary nodule segmentation.

2) QUANTITATIVE EVALUATION ON THE GHGZMCPLA DATASET
To evaluate the effectiveness of the proposed GMMFCM algorithm on the GHGZMCPLA dataset, a comparison experiment is conducted using the FCM, FCM_S, FCM_S1, FCM_S2, FLICM, FCM-type and Non-FCM algorithms. Table 3 shows the average values of the accuracy, sensitivity, specificity, overlap and DSC metrics between the segmented nodules and the ground truth on the GHGZMCPLA dataset. As shown in Table 3, the proposed algorithm obtains the highest Overlap and DSC, so its segmentation results are better than those of the other algorithms for various types of pulmonary nodules. In addition, its sensitivity of 0.9868 and specificity of 0.9997 are also higher than those of almost all the other algorithms. The experimental results demonstrate the robustness and effectiveness of the proposed algorithm on the GHGZMCPLA dataset for pulmonary nodule segmentation.

F. COMPARISON WITH THE STATE-OF-THE-ART ALGORITHMS
To justify the accuracy and effectiveness of the proposed algorithm for pulmonary nodule segmentation, we compare it with the state-of-the-art segmentation algorithms for pulmonary nodules. It is difficult to make a comparison with the previously published literature, as some studies use private datasets or subsets of public datasets, and sometimes the published literature does not report which cases are selected. However, a fair comparison among the algorithms is very important. Therefore, we compare against the published performance results that use the LIDC dataset, which helps to mitigate one of the variability factors. Since the metrics used by different algorithms are inconsistent across the published literature, the overlap scores mainly reported by these state-of-the-art algorithms are listed in Table 4 for a fair comparison. We believe that the performance comparison results on the GHGZMCPLA dataset are sufficiently similar to those on the LIDC dataset. Here, each row represents a published algorithm for pulmonary nodule segmentation and lists the reported overlap score. The first through eighth rows show the reported overlap scores of eight algorithms. They are, from top to bottom, the algorithms reported by Kostis ... because W. J. Kostis's algorithm used an ellipsoid shape to model the pulmonary nodules. However, the pulmonary nodule boundaries were not strictly ellipsoidal, so the model might miss some parts of the real pulmonary nodule boundaries. The average overlap of B. van Ginneken's algorithm was 0.66 ± 0.18, which was higher than those of W. J. Kostis's algorithm and K. Okada's algorithm. This was because B. van Ginneken's algorithm used the leave-one-out regime. However, it required the manual classification of pulmonary nodules into solid and non-solid nodules.
The average overlaps of J. M. Kuhnigk's algorithm and J. Wang's algorithm were 0.67 ± 0.22 and 0.64 ± 0.18, respectively. J. M. Kuhnigk's algorithm obtained a relatively higher overlap than the above algorithms, because it used an erosion operation to remove the adjacent vessel structures. However, it obtained a lower overlap than that of T. Messay2015's algorithm. Since it was difficult to determine the size of the structuring element, a size that was too large or too small could greatly affect the segmentation accuracy. The erosion operation may also remove a significant portion of the pulmonary nodule when the narrowest part of the nodule and the adjacent vessel have the same size. T. Kubota's algorithm obtained an overlap of 0.69 ± 0.18, which was higher than that of T. Messay2010's algorithm. The authors took full advantage of additional control points to improve the segmentation performance. However, this algorithm was inadequate for segmenting invasive juxta-pleural nodules. T. Messay2015's algorithm reported an average overlap of 0.77 ± 0.09, which was relatively higher than that of T. Kubota's algorithm and was comparable to that of our algorithm. The competition-diffusion (CD) based figure-ground separation could effectively remove the partial volume effects, resulting in a better segmentation performance.
As observed from the last row of Table 4, the proposed algorithm achieves a competitive segmentation performance in comparison with the other algorithms. Owing to the similar intensity of the nodule and its adjacent structures, the segmentation of juxta-vascularized nodules and juxta-pleural nodules by these algorithms is usually not very accurate. It is worth noting that the proposed algorithm achieves an average overlap of 0.86 ± 0.06, which is the highest overlap score among these state-of-the-art segmentation algorithms. The good segmentation performance of the proposed algorithm is attributed to the incorporation of GMM prior knowledge and spatial information.
To evaluate the computational efficiency, the mean execution time, measured in seconds on the LIDC dataset, is summarized in Table 5. We run our code on 23 cases with nodules of different shapes, sizes and textures. The GMMFCM algorithm and Kostis's algorithm [50] obtain mean execution times of 3.14 s and 4.49 s, respectively. The execution time of the GMMFCM algorithm is notably lower than those of the other state-of-the-art algorithms, which demonstrates that the proposed method is much faster than them.

V. DISCUSSIONS
The proposed GMMFCM algorithm can successfully segment various types of pulmonary nodules. The proposed superpixel-based random walker could reduce the computational complexity and the risk of assigning wrong labels to the corresponding pixels. The proposed multiscale dot enhancement filter could generate more reliable nodule seeds, and the proposed scheme of background seed generation could generate more accurate background seeds. The nodule and background GMM models were built by using the generated seeds. Then, the GMM models were used to define a local similarity measure. The new energy function was defined based on the local similarity measure to improve the segmentation performance. The visual comparisons between the segmentation results of the proposed algorithm and the ground truths on the two datasets were shown in Fig. 6 and Fig. 10. For some complex cases, such as juxta-pleural, juxta-vascularized and GGO nodules, the comparison results with the FCM, FCM_S, FCM_S1, FCM_S2 and FLICM algorithms on the LIDC dataset were shown in Fig. 7, Fig. 8 and Fig. 9. The comparison results on the GHGZMCPLA dataset were shown in Fig. 11. The segmentation results were shown by the red contours and the ground truths were shown by the green contours.
The quantitative comparison was performed in terms of seven evaluation criteria, including accuracy (see Eqn. 14), sensitivity (see Eqn. 15), specificity (see Eqn. 16), FPR (see Eqn. 17), FNR (see Eqn. 18), overlap score (see Eqn. 19) and DSC (see Eqn. 20). For comparison purposes, 100 images from the LIDC dataset were selected and each image was rescaled to 256 × 256 pixels to reduce the time cost. Considering the limited space of the paper, the five evaluation criteria of only 23 cases were shown in Table 1. Based on the numerical results shown in Table 1, it could be seen that the proposed algorithm was capable of segmenting various types of pulmonary nodules and obtained the desired segmentation results. The average overlap and DSC were 0.8528 and 0.9167, respectively.
In particular, the overlap and DSC were compared in Fig. 12(b) and (c), respectively. The sensitivity of pulmonary nodule segmentation was shown in Fig. 12(a).
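The averages quoted above can be sanity-checked with a short sketch, under the assumption that the overlap score is the Jaccard index J = TP/(TP+FP+FN). In that case, for a single case, DSC = 2J/(1+J). Because this mapping is concave, the mean DSC over several cases cannot exceed the value obtained by applying it to the mean overlap. The function names below are illustrative.

```python
def dice_from_overlap(j):
    """Per-case DSC implied by a Jaccard overlap j (assumed definition)."""
    return 2.0 * j / (1.0 + j)

def overlap_from_dice(d):
    """Inverse mapping: Jaccard overlap implied by a per-case DSC d."""
    return d / (2.0 - d)

# Averages quoted in the text for Table 1.
mean_overlap = 0.8528
mean_dsc = 0.9167

# Upper bound on the mean DSC implied by the mean overlap (Jensen's inequality).
dsc_upper_bound = dice_from_overlap(mean_overlap)
is_consistent = mean_dsc <= dsc_upper_bound
```

Under this assumption the bound is about 0.9206, so the reported mean DSC of 0.9167 is consistent with the reported mean overlap of 0.8528.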
For all cases, the quantitative segmentation results of the proposed algorithm were compared with the results provided by the FCM, FCM_S, FCM_S1, FCM_S2 and FLICM algorithms, as well as some other state-of-the-art algorithms for pulmonary nodule segmentation. In particular, the FLICM algorithm was comparable to the proposed algorithm. The comparison results of the proposed algorithm with the conventional FCM, FCM_S, FCM_S1, FCM_S2, FLICM, FCM-type and Non-FCM algorithms on the LIDC dataset and the GHGZMCPLA dataset were shown in Table 2 and Table 3, respectively. From Table 2, it can be clearly seen that the proposed algorithm provided a high overlap of about 0.8581 and a DSC of about 0.8986, which indicated a high similarity between the segmentation results and the ground truth. Similar results could be seen in Table 3, which compared the results on the GHGZMCPLA dataset. The segmentation performance of the proposed algorithm was improved by using the new local similarity measure. The quantitative segmentation results of the proposed algorithm were also compared with the results reported by the state-of-the-art algorithms for pulmonary nodule segmentation. Based on the numerical results shown in Table 4, it can also be seen that the proposed algorithm achieved comparable segmentation results.

VI. CONCLUSION AND FUTURE WORK
In this paper, the GMMFCM algorithm is proposed for segmentation of the pulmonary nodules. We performed a detailed segmentation performance comparison with FCM, FCM_S, FCM_S1, FCM_S2, FLICM, FCM-type and Non-FCM algorithms. The experiments have been conducted on the LIDC dataset and the GHGZMCPLA dataset to test the performance of the GMMFCM algorithm. The results show that the performance of the proposed GMMFCM algorithm is promising for segmentation of the pulmonary nodules, and is more robust than the other algorithms.
Several other factors also play a role in the performance improvement of the proposed algorithm. A superpixel-based random walker algorithm is employed for pulmonary parenchyma segmentation, and a new multiscale dot enhancement filter is defined for nodule seed generation. The local similarity measure is defined by using the GMM posterior probability and spatial information. In particular, the overlap and sensitivity of the proposed algorithm are higher than those of some other algorithms.
In future work, we will investigate the techniques of data sampling so that the proposed algorithm can be extended to large-scale segmentation problems. Further study on this topic will also include many applications of the GMMFCM algorithm in other problems.
BIN LI received the B.S. and Ph.D. degrees from the School of Automation Science and Engineering, South China University of Technology (SCUT), Guangzhou, China, in 2002 and 2007, respectively. He is currently an Associate Professor of automation science and engineering with SCUT. His current research interests include information visualization, medical image processing, and pattern recognition.
FANG LIU received the Ph.D. degree from the South China University of Technology, Guangzhou, China. She is currently a Lecturer with the School of Internet Finance and Information Engineering, Guangdong University of Finance and Economics. Her research interests include transfer learning, privileged information, and action recognition.
HUA YIN received the Ph.D. degree from the Wuhan University, Wuhan, China. She is currently a Lecturer with the Information Engineering, Guangdong University of Finance and Economics, Guangzhou, China. Her current research interests include image segmentation, image processing, and pattern recognition.
FENG ZHOU received the B.S. degree in information computing science from Minnan Normal University, China, in 2010, and the Ph.D. degree in computational mathematics from Sun Yat-sen University, China, in 2015. As a Visiting Student, he studied at the School of Mathematics, Georgia Institute of Technology, from September 2013 to September 2014. He is currently an Assistant Professor with the School of Information, Guangdong University of Finance and Economics. His research interests include signal analysis, machine learning, and ensemble learning.