Clustering is an important computational tool for identifying sets of genes with similar expression profiles. Various clustering methods have been proposed for analyzing gene expression data. Among them, k-means is widely used because its simplicity and computational speed allow it to run on large datasets. However, k-means requires the number of clusters to be specified before clustering, and this choice greatly influences the clustering results. This paper proposes a novel center closeness clustering algorithm that automatically determines the number of clusters from the distances between data points. We applied the proposed algorithm to two gene expression datasets and compared the clustering results with those obtained by k-means. According to the cluster validity indices, our algorithm clearly outperformed k-means.
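The dependence of k-means on a preset cluster number can be illustrated with a minimal sketch. This is plain Lloyd's k-means, not the center closeness algorithm proposed here. The toy `profiles` data, the function names, and the deterministic farthest-first initialization are assumptions made for illustration only, and the within-cluster sum of squares stands in for the cluster validity indices, which are not specified in this summary.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centers(points, k):
    """Deterministic initialization (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means. The caller must supply k in advance."""
    centers = farthest_first_centers(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


def within_cluster_ss(points, labels, centers):
    """Sum of squared distances from each point to its assigned center."""
    return sum(squared_distance(p, centers[l]) for p, l in zip(points, labels))


# Hypothetical two-condition expression profiles forming three groups.
profiles = [
    (1.0, 1.1), (0.9, 1.0), (1.1, 0.9),
    (5.0, 5.1), (5.1, 4.9), (4.9, 5.0),
    (9.0, 1.0), (9.1, 1.1), (8.9, 0.9),
]

for k in (2, 3, 4):
    labels, centers = kmeans(profiles, k)
    print(k, labels, round(within_cluster_ss(profiles, labels, centers), 3))
```

Running the loop with different values of `k` yields different partitions of the same data, which is the sensitivity that motivates determining the cluster number automatically.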