We present HS-means, an algorithm that learns the number of clusters in a mixture model. Our method extends the concept of clustering stability to hierarchical stability. It first chooses a model for the data based on an analysis of clustering stability; it then analyzes the stability of each component in the estimated model and chooses a stable model for that component. This recursive stability analysis continues until all estimated components are unimodal. As a result, the method can handle hierarchical and symmetric data, with which existing stability-based algorithms have difficulty. We test our algorithm on both synthetic and real-world datasets. The results show that HS-means outperforms a popular stability-based model selection algorithm, both in handling symmetric data and in finding high-quality clusterings in the task of predicting CPU performance.
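The recursive procedure described above can be illustrated with a minimal sketch. It is not the HS-means implementation; it only mirrors the control flow (pick a stable model, split, recurse until each part looks unimodal) under several assumptions of ours: the data are one-dimensional, the base clusterer is k-means with quantile initialisation, instability is measured as label disagreement between fits on random subsamples, and the unimodality check is a crude largest-gap test. All function names and parameters (`hs_cluster`, `instability`, `is_unimodal`, `k_max`, `min_gap`) are hypothetical.

```python
import random
from statistics import mean


def nearest(x, centers):
    """Index of the center closest to x."""
    return min(range(len(centers)), key=lambda j: abs(x - centers[j]))


def kmeans_1d(xs, k, iters=50):
    """Lloyd's algorithm in 1-D with deterministic quantile initialisation."""
    s = sorted(xs)
    centers = [s[(2 * j + 1) * len(s) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[nearest(x, centers)].append(x)
        # An empty group keeps its previous center.
        updated = [mean(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def instability(xs, k, rng, n_pairs=10, frac=0.8):
    """Mean label disagreement between fits on pairs of random subsamples."""
    m = max(k, int(frac * len(xs)))
    total = 0.0
    for _ in range(n_pairs):
        a = kmeans_1d(rng.sample(xs, m), k)
        b = kmeans_1d(rng.sample(xs, m), k)
        # Sorted centers make labels comparable across the two fits (1-D only).
        total += sum(nearest(x, a) != nearest(x, b) for x in xs) / len(xs)
    return total / n_pairs


def is_unimodal(xs, min_gap):
    """Assumed stand-in test: no gap between neighbouring points reaches min_gap."""
    s = sorted(xs)
    return all(b - a < min_gap for a, b in zip(s, s[1:]))


def hs_cluster(xs, rng, k_max=3, min_gap=0.5):
    """Recursively split xs until every part passes is_unimodal."""
    if len(xs) < 2 or is_unimodal(xs, min_gap):
        return [xs]
    ks = range(2, min(k_max, len(xs)) + 1)
    # Pick the most stable k; ties go to the smaller k.
    best_k = min(ks, key=lambda k: instability(xs, k, rng))
    centers = kmeans_1d(xs, best_k)
    parts = [[] for _ in centers]
    for x in xs:
        parts[nearest(x, centers)].append(x)
    parts = [p for p in parts if p]
    if len(parts) < 2:  # no progress; stop to avoid infinite recursion
        return [xs]
    return [leaf for p in parts for leaf in hs_cluster(p, rng, k_max, min_gap)]


# Toy hierarchical data: two distant groups, each made of three tight sub-groups.
data = [c + i * 0.01 for c in (0, 1, 2, 10, 11, 12) for i in range(-10, 11)]
leaves = hs_cluster(data, random.Random(0))
print(len(leaves))
```

On this toy input the top-level split separates the two distant groups, and the recursion then separates the tight sub-groups inside each one. The gap threshold and the subsample-based instability score are choices made for the illustration only; a real unimodality test and stability estimate would replace them.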