The problem of clustering continuous-valued data has been well studied in the literature. Its application to microarray analysis relies on algorithms such as k-means, dimensionality reduction techniques, and graph-based approaches for building dendrograms of sample data. In contrast, the corresponding problems for discrete-attributed data remain relatively unexplored. One instance of discrete-attributed data analysis arises in detecting co-regulated samples in microarrays. In this paper, we present an algorithm and a software framework, PROXIMUS, for error-bounded clustering of high-dimensional discrete-attributed datasets in the context of extracting co-regulated samples from microarray data. We show that PROXIMUS delivers outstanding performance in extracting accurate patterns of gene expression.