Abstract:
Determining the optimal number of clusters is crucial to clustering quality in cluster analysis. From the standpoint of sample geometry, two concepts are defined, the sample clustering dispersion degree and the sample clustering synthesis degree, and a new clustering validity index is designed. Moreover, a method for determining the optimal number of clusters based on an agglomerative hierarchical clustering (AHC) algorithm is proposed. The new index and the method can evaluate the clustering results produced by AHC and determine the optimal number of clusters for multiple types of datasets, including those with linear, manifold, annular, and convex structures. Theoretical research and experimental results indicate the validity and good performance of the proposed index and method.
Published in: IEEE Transactions on Neural Networks and Learning Systems (Volume: 28, Issue: 12, December 2017)
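
The general workflow the abstract describes (run AHC, score the partition at each candidate number of clusters with a validity index, keep the best-scoring one) can be illustrated with a minimal sketch. The abstract does not give the formulas for the dispersion and synthesis degrees, so this sketch does not implement the proposed index. It substitutes the well-known silhouette coefficient as a stand-in, and it assumes average linkage and Euclidean distance. The function names (`agglomerative_levels`, `silhouette`, `best_k`) and the toy data are hypothetical and chosen for illustration only.

```python
import math


def _labels(clusters, n):
    """Turn a list of clusters (lists of point indices) into a label list."""
    labels = [0] * n
    for c, members in enumerate(clusters):
        for p in members:
            labels[p] = c
    return labels


def agglomerative_levels(points):
    """Naive average-linkage AHC (an assumption, not the paper's setting).

    Returns {k: labels} for every k from len(points) down to 1.
    """
    n = len(points)
    clusters = [[i] for i in range(n)]
    levels = {n: _labels(clusters, n)}
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(math.dist(points[p], points[q])
                            for p in clusters[i] for q in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        levels[len(clusters)] = _labels(clusters, n)
    return levels


def silhouette(points, labels):
    """Mean silhouette coefficient (stand-in index); needs >= 2 clusters."""
    n = len(points)
    total = 0.0
    for i in range(n):
        sums, counts = {}, {}
        for j in range(n):
            if i != j:
                c = labels[j]
                sums[c] = sums.get(c, 0.0) + math.dist(points[i], points[j])
                counts[c] = counts.get(c, 0) + 1
        own = labels[i]
        if counts.get(own, 0) == 0:
            continue  # singleton cluster contributes 0 by convention
        a = sums[own] / counts[own]                      # cohesion
        b = min(sums[c] / counts[c] for c in sums if c != own)  # separation
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


def best_k(points, levels, k_max):
    """Score each candidate k in 2..k_max and return the best one."""
    scores = {k: silhouette(points, levels[k]) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): three well-separated unit squares in the plane.
points = [(x + dx, y + dy)
          for (x, y) in [(0, 0), (10, 0), (0, 10)]
          for dx in (0, 1) for dy in (0, 1)]
levels = agglomerative_levels(points)
k, scores = best_k(points, levels, k_max=6)
print("selected number of clusters:", k)
```

Because the hierarchy is built once and every level is stored, each candidate number of clusters is scored without re-running the clustering. Any other validity index with the same signature as `silhouette` could be passed through the same loop.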