To address the efficiency bottleneck of the DBSCAN algorithm, we present P-DBSCAN, a novel parallel version of the algorithm for distributed environments. The database is partitioned into several parts, the compute nodes cluster their parts independently, and the sub-results are then aggregated into one final result. P-DBSCAN produces good clustering results with much better efficiency than DBSCAN. Experiments show that P-DBSCAN achieves an acceleration of more than 40% on one PC and 60% on two PCs. In addition, the parallel algorithm scales much better than DBSCAN, so it can be used for cluster analysis of very large databases.
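
The partition, cluster, and aggregate scheme can be illustrated with a minimal Python sketch. It is not the P-DBSCAN implementation: the abstract does not say how the database is split or how sub-results are aggregated, so every detail below is an assumption made for illustration. In particular, the equal-width strips along the first coordinate, the overlap ("halo") of `2 * eps` between strips, the union-find merge on shared core points, and the names `dbscan`, `find`, and `partitioned_dbscan` are all hypothetical choices.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Brute-force DBSCAN. Returns (labels, is_core); label -1 means noise."""
    n = len(points)
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    is_core = [len(nb) >= min_pts for nb in neighbors]  # count includes the point itself
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        labels[i] = cluster
        stack = [i]  # only core points are ever pushed
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if is_core[q]:
                        stack.append(q)
        cluster += 1
    return labels, is_core


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def partitioned_dbscan(points, eps, min_pts, n_parts):
    """Cluster overlapping strips independently, then merge the sub-results."""
    xs = [p[0] for p in points]
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_parts or 1.0  # avoid zero width if all x are equal
    owner = [min(int((x - lo) / width), n_parts - 1) for x in xs]

    parent = {}                          # union-find over (part, local_label)
    memberships = [[] for _ in points]   # local clusters each point fell into
    core = [False] * len(points)         # core flag as seen by the owning strip

    # Assumption: this loop stands in for independent workers; each iteration
    # only needs its own strip plus a halo of 2 * eps on either side.
    for part in range(n_parts):
        left, right = lo + part * width, lo + (part + 1) * width
        idx = [i for i, x in enumerate(xs)
               if left - 2 * eps <= x <= right + 2 * eps]
        labels, is_core = dbscan([points[i] for i in idx], eps, min_pts)
        for i, lab, c in zip(idx, labels, is_core):
            if lab != -1:
                key = (part, lab)
                parent.setdefault(key, key)
                memberships[i].append(key)
            if owner[i] == part:
                core[i] = c

    # Aggregate: local clusters that share a core point are the same cluster.
    for i, keys in enumerate(memberships):
        if core[i]:
            for k in keys[1:]:
                parent[find(parent, k)] = find(parent, keys[0])

    ids, final = {}, []
    for keys in memberships:
        if not keys:
            final.append(-1)  # noise in every strip that saw the point
        else:
            final.append(ids.setdefault(find(parent, keys[0]), len(ids)))
    return final
```

The `2 * eps` halo is chosen so that every neighbor of a point owned by a strip also sees its full `eps`-neighborhood inside that strip, which keeps core-point decisions consistent across strips. In this sketch the strips are processed sequentially; in a distributed setting each call to `dbscan` would run on a separate node, and only the label lists would need to be sent back for merging.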