Estimating the number of clusters in microarray data sets based on an information theoretic criterion

3 Author(s)
Nicorici, D. (Inst. of Signal Process., Tampere Univ. of Technol.); Astola, J.; Yli-Harja, O.

This study focuses on an information-theoretic approach for estimating the number of clusters, K, in microarray data sets. We present an automatic method for estimating K, based on a particular version of the normalized maximum likelihood (NML) model. The strength of minimum description length (MDL) methods, such as the NML model, in statistical inference lies in finding the model structure, which in this particular clustering problem amounts to finding the best number of clusters and the best cluster structure for the data. The models are compared using the NML code length. The study introduces a new method, based on the NML model, for computing the code length of the encoded clustering vector for the data samples. Experiments with publicly available microarray data sets demonstrate the ability of the new method to find biologically meaningful clusters.
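
The idea of choosing K by comparing code lengths can be illustrated with a minimal sketch. It does not reproduce the NML criterion of the paper, whose exact form is not given in the abstract. Instead it assumes a cruder two-part MDL code: a spherical Gaussian cost for the data, a (1/2) log2 n cost per real-valued parameter, and log2 K bits per sample for the clustering vector. The clustering itself is plain k-means. All function names (`kmeans`, `code_length`, `estimate_k`, `toy_data`) and the synthetic three-blob data are hypothetical and chosen only for illustration.

```python
import math

def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next centre: the point farthest from every centre chosen so far.
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(far))

    def assign():
        return [min(range(k), key=lambda j: sqdist(p, centers[j])) for p in points]

    for _ in range(iters):
        labels = assign()
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign(), centers

def code_length(points, labels, centers):
    """Two-part code length in bits (an assumed stand-in, NOT the paper's NML)."""
    n, d, k = len(points), len(points[0]), len(centers)
    rss = sum(sqdist(p, centers[l]) for p, l in zip(points, labels))
    var = max(rss / (n * d), 1e-12)  # shared ML variance, guarded against zero
    data_bits = 0.5 * n * d * math.log2(2 * math.pi * math.e * var)
    param_bits = 0.5 * (k * d + 1) * math.log2(n)  # centres plus the variance
    label_bits = n * math.log2(k)                  # clustering vector, uniform code
    return data_bits + param_bits + label_bits

def estimate_k(points, k_max):
    """Return the K in 1..k_max with the shortest total code length."""
    lengths = {k: code_length(points, *kmeans(points, k)) for k in range(1, k_max + 1)}
    return min(lengths, key=lengths.get)

def toy_data(per_cluster=30, radius=1.5):
    """Three well-separated synthetic 2-D blobs (not microarray data)."""
    pts = []
    for cx, cy in [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]:
        for i in range(per_cluster):
            r = radius * math.sqrt((i + 0.5) / per_cluster)
            a = i * 2.399963  # golden angle, spreads points evenly over a disk
            pts.append([cx + r * math.cos(a), cy + r * math.sin(a)])
    return pts

if __name__ == "__main__":
    print(estimate_k(toy_data(), k_max=6))
```

The trade-off is visible in `code_length`: adding clusters shrinks the data term but enlarges the parameter and clustering-vector terms, so the minimum falls where extra clusters stop paying for themselves. An NML-based criterion replaces these ad hoc penalty terms with a normalized code, which this sketch does not attempt.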

Published in:

Statistical Signal Processing, 2005 IEEE/SP 13th Workshop on

Date of Conference:

17-20 July 2005