Skip to Main Content
Clustering is a popular technique for analyzing microarray data sets, with n genes and m experimental conditions. As explored by biologists, there is a real need to identify coregulated gene clusters, which include both positive and negative regulated gene clusters. The existing pattern-based and tendency-based clustering approaches cannot directly be applied to find such coregulated gene clusters, because they are designed for finding positive regulated gene clusters. In this paper, in order to cluster coregulated genes, we propose a coding scheme that allows us to cluster two genes into the same cluster if they have the same code, where two genes that have the same code can be either positive or negative regulated. Based on the coding scheme, we propose a new algorithm for finding maximal subspace coregulated gene clusters with new pruning techniques. A maximal subspace coregulated gene cluster clusters a set of genes on a condition sequence such that the cluster is not included in any other subspace coregulated gene clusters. We conduct extensive experimental studies. Our approach can effectively and efficiently find maximal subspace coregulated gene clusters. In addition, our approach outperforms the existing approaches for finding positive regulated gene clusters.