Clustering analysis is a primary method for data mining. The ever-increasing volumes of data in different applications force clustering algorithms to cope with them. DBSCAN is a well-known algorithm for density-based clustering. It is both effective, in that it can detect arbitrarily shaped clusters of dense regions, and efficient, especially when spatial indexes are available to perform the neighborhood queries. In this paper we introduce a new algorithm, GriDBSCAN, which enhances the performance of DBSCAN using grid partitioning and merging, yielding high performance with the advantage of a high degree of parallelism. We verified the correctness of the algorithm theoretically and experimentally, and studied its performance both theoretically and through experiments on real and synthetic data. It proved to run much faster than the original DBSCAN. We compared the algorithm with a similar algorithm, EnhancedDBSCAN, which also enhances DBSCAN using partitioning. Experiments showed the new algorithm's superiority in performance and degree of parallelism.
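The partition-then-merge idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`dbscan`, `grid_dbscan`), the `cell` parameter, the restriction to 2-D points, and the choice of a halo of width `eps` around each grid cell are all assumptions made for illustration. Each occupied cell is clustered independently on its own points plus the halo, which is where parallelism could be exploited, and local clusters are then merged whenever they share a point that is a core point in its home cell.

```python
from collections import defaultdict
from math import dist, floor

NOISE = -1

def dbscan(points, ids, eps, min_pts):
    """Plain O(n^2) DBSCAN restricted to the point ids given.
    Returns (labels, cores); labels maps id -> local cluster number or NOISE."""
    nbrs = {i: [j for j in ids if dist(points[i], points[j]) <= eps] for i in ids}
    cores = {i for i in ids if len(nbrs[i]) >= min_pts}  # a point counts itself
    labels = {i: NOISE for i in ids}
    cluster = 0
    for i in ids:
        if i not in cores or labels[i] != NOISE:
            continue
        labels[i] = cluster
        stack = [i]                      # only core points are ever expanded
        while stack:
            p = stack.pop()
            for q in nbrs[p]:
                if labels[q] == NOISE:
                    labels[q] = cluster
                    if q in cores:
                        stack.append(q)
        cluster += 1
    return labels, cores

def grid_dbscan(points, eps, min_pts, cell):
    """Hypothetical grid-partitioned DBSCAN for 2-D points (cell = cell side)."""
    home = {}                            # point id -> cell that owns the point
    members = defaultdict(list)          # cell -> ids of inner + halo points
    for i, (x, y) in enumerate(points):
        home[i] = (floor(x / cell), floor(y / cell))
        # The point joins every cell whose box, grown by eps, contains it.
        for cx in range(floor((x - eps) / cell), floor((x + eps) / cell) + 1):
            for cy in range(floor((y - eps) / cell), floor((y + eps) / cell) + 1):
                members[(cx, cy)].append(i)

    # Independent runs, one per occupied cell; these could execute in parallel.
    occupied = set(home.values())
    local = {c: dbscan(points, members[c], eps, min_pts) for c in occupied}

    parent = {}                          # union-find over (cell, local label)
    def find(k):
        parent.setdefault(k, k)
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    # Merge: a halo point that is a core point in its home cell joins the
    # cluster it received here with the cluster it received at home.
    for c in occupied:
        for i, lab in local[c][0].items():
            h = home[i]
            if lab != NOISE and h != c and i in local[h][1]:
                parent[find((c, lab))] = find((h, local[h][0][i]))

    result, numbering = [], {}
    for i in range(len(points)):
        h = home[i]
        key = (h, local[h][0][i]) if local[h][0][i] != NOISE else None
        if key is None:
            # A border point may only be reachable from a core in another cell.
            for c in occupied:
                lab = local[c][0].get(i, NOISE)
                if lab != NOISE:
                    key = (c, lab)
                    break
        if key is None:
            result.append(NOISE)
        else:
            result.append(numbering.setdefault(find(key), len(numbering)))
    return result
```

Calling `grid_dbscan(points, eps, min_pts, cell)` on a list of `(x, y)` tuples returns one label per point, with `-1` marking noise. The quadratic neighbor search is used only to keep the sketch short; a spatial index, as the abstract mentions, would replace it in practice.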