Cluster validation using a probabilistic attributed graph
Fred, A.
Jain, A.K.
Inst. Super. Tecnico, Lisbon
This paper appears in: Pattern Recognition, 2008. ICPR 2008. 19th International Conference on
Publication Date: 8-11 Dec. 2008
On page(s): 1-4
Location: Tampa, FL
ISSN: 1051-4651
ISBN: 978-1-4244-2174-9
INSPEC Accession Number: 10458131
Digital Object Identifier: 10.1109/ICPR.2008.4761787
Current Version Published: 2009-01-23
Abstract
We propose a new cluster validity index. A data partition is described by a set of disjoint sub-graphs, each corresponding to the minimum spanning tree of a cluster, taking as edge weight the dissimilarity between linked objects. Based on the assumption that each cluster has a characteristic parametric distribution of dissimilarity increments, graph probabilities are estimated. The validity index is defined as the minimum description length for both estimated model parameters and data partition, according to this probabilistic model. This new index can be used to evaluate various partitions of a given data set obtained by: (i) a single clustering algorithm, (ii) different clustering algorithms, or (iii) cluster ensemble methods. Experimental evaluation of the proposed index on synthetic and real data taken from the UCI repository confirms the usefulness of the method in selecting good clustering solutions.
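The graph description of a partition mentioned in the abstract can be illustrated with a minimal sketch. It builds one minimum spanning tree per cluster with Prim's algorithm and covers only that first step; the probabilistic model of dissimilarity increments and the minimum description length score are not reproduced here. The function names (cluster_mst, partition_graphs) and the use of Euclidean distance as the dissimilarity are assumptions made for illustration, not details taken from the paper.

```python
import math


def cluster_mst(points):
    """Return MST edges (i, j, weight) for one cluster using Prim's algorithm.

    Euclidean distance stands in for the dissimilarity (an assumption).
    Indices i and j refer to positions within this cluster's point list.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Relax the links from the newly added node.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def partition_graphs(points, labels):
    """Describe a partition as a set of disjoint sub-graphs: one MST per label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return {lab: cluster_mst(pts) for lab, pts in clusters.items()}


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (3, 0), (10, 10), (10, 12)]
    labs = [0, 0, 0, 1, 1]
    for lab, edges in partition_graphs(pts, labs).items():
        print(lab, edges)
```

The edge weights returned for each cluster are the quantities from which a distribution of dissimilarity increments would then be estimated; how that estimation is done is left to the paper itself.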