Clustering is useful in data mining for discovering distribution patterns in the underlying data. ROCK is one such hierarchical clustering algorithm; it works on sampled data. We show that the sequential ROCK algorithm is time-consuming for large data sets. We therefore present distributed algorithms that perform better than known algorithms. We develop DROCK, a robust distributed hierarchical clustering algorithm based on ROCK, in which preliminary calculations are carried out on different processors. In addition to presenting detailed complexity results for DROCK, we conduct an experimental study with real-life data sets to demonstrate the effectiveness of our technique.
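To make the idea of spreading ROCK's preliminary calculations across processors concrete, the following is a minimal sketch and not the DROCK algorithm itself. It relies on the standard ROCK definitions: two points are neighbors when their Jaccard similarity is at least a threshold theta, and the link count of a pair is its number of common neighbors. The function names (`jaccard`, `neighbor_rows`, `compute_links`), the round-robin split of rows across workers, the use of threads in place of separate processors, and the choice not to count a point as its own neighbor are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbor_rows(points, theta, rows):
    """Neighbor sets for one worker's share of the points (hypothetical split)."""
    return {
        i: {
            j
            for j, q in enumerate(points)
            if j != i and jaccard(points[i], q) >= theta
        }
        for i in rows
    }


def compute_links(points, theta, workers=2):
    """Link counts (number of common neighbors) for every pair of points."""
    n = len(points)
    # Assumption: rows are dealt round-robin to the workers.
    chunks = [range(w, n, workers) for w in range(workers)]
    neighbors = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda rows: neighbor_rows(points, theta, rows), chunks)
        for part in parts:
            neighbors.update(part)
    # Merge step: count common neighbors for each pair, keeping nonzero counts.
    links = {}
    for i, j in combinations(range(n), 2):
        common = len(neighbors[i] & neighbors[j])
        if common:
            links[(i, j)] = common
    return links


# Four small categorical records encoded as item sets (made-up example data).
points = [{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {7, 8, 9}]
links = compute_links(points, theta=0.5)
```

In this sketch only the neighbor computation is divided among workers; the link counting and any subsequent merging of clusters would still run in one place. How DROCK actually partitions the work is described in the paper, not here.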