Clustering analysis is a primary method in data mining. Density-based clustering has two main advantages: its clusters are easy to understand, and it places no restriction on the shape of clusters. However, existing density-based algorithms have trouble finding all the meaningful clusters in datasets with varied densities. This paper introduces a new algorithm, VDBSCAN, for analyzing varied-density datasets. The basic idea of VDBSCAN is that, before the traditional DBSCAN algorithm is applied, several values of the parameter Eps are selected for the different densities according to a k-dist plot. With different values of Eps, clusters of varied densities can be found simultaneously. For each value of Eps, DBSCAN is run so that all clusters at the corresponding density are found. In each subsequent run, the points that have already been clustered are ignored, which avoids marking denser and sparser areas as one cluster. Finally, a synthetic two-dimensional dataset is used for demonstration, and experiments show that VDBSCAN is efficient and succeeds in clustering uneven datasets.
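The multi-pass procedure described above can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function names (`k_dist`, `dbscan_pass`, `vdbscan`), the `min_pts` parameter, and the `-1` noise label are illustrative assumptions. The abstract does not say how the Eps values are read off the k-dist plot, so the sketch takes them as input from the caller. Processing them in ascending order (densest level first) is one reading of the statement that already clustered points are ignored in later runs.

```python
import math

def k_dist(points, k):
    """Sorted distance from each point to its k-th nearest neighbour.
    Plotting this list gives a k-dist plot from which Eps values can be chosen."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)

def dbscan_pass(points, eps, min_pts, labels, next_id):
    """One DBSCAN pass restricted to points not clustered in earlier passes."""
    active = [i for i, lab in enumerate(labels) if lab is None]

    def neighbours(i):
        # Eps-neighbourhood within the active set (includes the point itself).
        return [j for j in active if math.dist(points[i], points[j]) <= eps]

    for i in active:
        if labels[i] is not None:
            continue  # already absorbed into a cluster during this pass
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            continue  # not a core point at this Eps; leave it for a later pass
        labels[i] = next_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] is not None:
                continue
            labels[j] = next_id
            nb = neighbours(j)
            if len(nb) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(n for n in nb if labels[n] is None)
        next_id += 1
    return next_id

def vdbscan(points, eps_values, min_pts):
    """Run DBSCAN once per Eps value (assumed order: smallest Eps first),
    ignoring points that an earlier pass has already clustered."""
    labels = [None] * len(points)
    next_id = 0
    for eps in sorted(eps_values):
        next_id = dbscan_pass(points, eps, min_pts, labels, next_id)
    return [-1 if lab is None else lab for lab in labels]  # -1 marks noise

# Hypothetical 2-D data: a dense grid, a sparse grid, and one outlier.
dense = [(x * 0.1, y * 0.1) for x in range(3) for y in range(3)]
sparse = [(10 + x * 1.0, 10 + y * 1.0) for x in range(3) for y in range(3)]
points = dense + sparse + [(50.0, 50.0)]

kd = k_dist(points, 3)  # values one would inspect to pick the Eps levels
labels = vdbscan(points, eps_values=[0.15, 1.5], min_pts=3)
print(labels)
```

In this toy dataset the two Eps values are chosen by hand to match the two grid spacings. The dense grid is clustered in the first pass, the sparse grid in the second, and the outlier stays unclustered.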