Skip to Main Content
Cluster analysis of gene expression data is useful for identifying biologically relevant groups of genes. However, finding the correct clusters in the data and estimating the correct number of clusters are still two largely unsolved problems. In this paper, we propose a new clustering framework that is able to address both these problems. By using the one-prototype-take-one-cluster (OPTOC) competitive learning paradigm, the proposed algorithm can find natural clusters in the input data, and the clustering solution is not sensitive to initialization. In order to estimate the number of distinct clusters in the data, an over-clustering and merging strategy is proposed. For validation, we applied the new algorithm to both simulated gene expression data and real gene expression data (expression changes during yeast cell cycle). The results clearly indicate the effectiveness of our method.