Traditional clustering algorithms (e.g., the K-means algorithm and its variants) require the number of clusters to be fixed in advance. In many clustering applications, however, the actual number of clusters is not known beforehand. The usual solution to this type of clustering problem is to select or define a cluster validity index and then run a traditional clustering algorithm for every candidate number of clusters in sequence, keeping the clustering with the best validity. This is tedious and time-consuming. To determine the optimal number of clusters easily and effectively while also constructing clusters with good validity, we propose a framework of automatic clustering algorithms (called ETSAs) that does not require users to supply candidate values for the required parameters (including the number of clusters). ETSAs treat the number of clusters as a variable and evolve it toward an optimal value. In experiments on nine test data sets, we compared the ETSA with five traditional clustering algorithms. The results demonstrate the superiority of the ETSA in finding the correct number of clusters while constructing clusters with good validity.
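For concreteness, the conventional sweep over candidate numbers of clusters can be sketched as follows. This is only an illustration of the baseline procedure, not of the ETSA. The abstract does not name a validity index, so the mean silhouette width is used here as an assumption, and the helper names (`farthest_first`, `kmeans`, `silhouette`, `select_k_by_sweep`) are hypothetical. Farthest-first initialization is likewise an assumption, chosen to keep the sketch deterministic.

```python
import math
import random


def farthest_first(points, k):
    """Deterministic seeding (an assumption): each new center is the point
    farthest from its nearest already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center by Euclidean distance.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette width, used here as the validity index (an assumption)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0  # the index is undefined for a single cluster
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def select_k_by_sweep(points, k_max):
    """Run K-means for every k in 2..k_max and keep the best-scoring k."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Synthetic data for illustration only: three well-separated 2-D blobs.
    rng = random.Random(0)
    blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    data = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
            for cx, cy in blobs for _ in range(20)]
    best_k, all_scores = select_k_by_sweep(data, k_max=6)
    print(best_k, all_scores)
```

Every candidate k costs a full clustering run plus an index evaluation, which is the repeated work that treating the number of clusters as an evolving variable is meant to avoid.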