Block clustering via the block GEM and two-way EM algorithms | IEEE Conference Publication | IEEE Xplore

Block clustering via the block GEM and two-way EM algorithms


Abstract:

Summary form only given. Cluster analysis is an important tool in a variety of scientific areas such as pattern recognition, information retrieval, microarray analysis, and data mining. Although many clustering procedures, such as hierarchical clustering, k-means, or self-organizing maps, aim to construct an optimal partition of the set of objects I or, sometimes, of the set of variables J, other methods, called block clustering methods, consider the two sets simultaneously and organize the data into homogeneous blocks. These methods are fast, can process large data sets, and require far fewer computations than working on I and J separately. The mixture model is undoubtedly one of the greatest contributions to clustering. We recently proposed a generalized EM (GEM) algorithm to maximize a variational approximation of the likelihood. It is an iterative algorithm whose steps are carried out by applying the EM algorithm to intermediate mixture models. This paper focuses on the clustering context and compares block GEM with two-way EM, i.e., EM applied separately to I and J. Results on simulated data are given, confirming that block GEM performs much better than two-way EM.
Date of Conference: 06-06 January 2005
Date Added to IEEE Xplore: 13 June 2005
Print ISBN: 0-7803-8735-X

Conference Location: Cairo, Egypt

1. Introduction

Cluster analysis is an important tool in a variety of scientific areas such as pattern recognition, information retrieval, microarray analysis, and data mining. Although many clustering procedures, such as hierarchical clustering, k-means, or self-organizing maps, aim to construct an optimal partition of the objects or, sometimes, of the variables, other methods, called block clustering methods, consider the two sets simultaneously and organize the data into homogeneous blocks. If x denotes a data matrix defined by x = {(x_ij); i ∈ I and j ∈ J}, where I is a set of objects (rows, observations, cases) and J is a set of variables (columns, attributes), the basic idea of these methods is to permute the objects and the variables so as to reveal a correspondence structure on I × J.
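The idea of permuting rows and columns to expose homogeneous blocks can be illustrated with a minimal Python sketch. It is not the block GEM or two-way EM algorithm: it assumes the row and column partitions are already known, and only shows how a data matrix is reorganized once they are. The function names `reorder_into_blocks` and `block_means`, the toy matrix, and the labels are hypothetical and chosen for illustration only.

```python
import numpy as np

def reorder_into_blocks(x, row_labels, col_labels):
    """Permute rows and columns so members of the same cluster are adjacent."""
    # A stable sort keeps the original order inside each cluster.
    row_order = np.argsort(row_labels, kind="stable")
    col_order = np.argsort(col_labels, kind="stable")
    return x[np.ix_(row_order, col_order)], row_order, col_order

def block_means(x, row_labels, col_labels):
    """Mean of x inside each (row cluster, column cluster) block."""
    g = row_labels.max() + 1  # number of row clusters (labels assumed 0..g-1)
    m = col_labels.max() + 1  # number of column clusters (labels assumed 0..m-1)
    means = np.zeros((g, m))
    for k in range(g):
        for l in range(m):
            means[k, l] = x[np.ix_(row_labels == k, col_labels == l)].mean()
    return means

# Toy binary data matrix: rows are objects (I), columns are variables (J).
x = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
])
# Assumed partitions of I and J; a block clustering method would estimate these.
row_labels = np.array([0, 1, 0, 1])
col_labels = np.array([0, 1, 0, 1])

blocks, row_order, col_order = reorder_into_blocks(x, row_labels, col_labels)
means = block_means(x, row_labels, col_labels)
print(blocks)
print(means)
```

After reordering, the ones and zeros of the toy matrix gather into four constant blocks, and each block is summarized by its mean. A block clustering method searches for the two partitions that make such blocks as homogeneous as possible.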
