Point symmetry-based clustering is an important unsupervised learning tool for recognizing symmetrical clusters, whether convex or non-convex in shape, including in microarray datasets. To enable fast clustering of such large datasets, this article proposes a distributed, scalable, space- and time-efficient parallel approach to the point symmetry-based K-means algorithm. A natural basis for analyzing gene expression data with this symmetry-based algorithm is to group together genes with similar symmetrical patterns of expression. The new parallel implementation achieves a quadratic reduction in running time, along with reduced space and communication overhead, without sacrificing the quality of the clustering solution. The parallel point symmetry-based K-means algorithm is compared with another newly implemented parallel symmetry-based K-means and with existing parallel K-means on four artificial, real-life and benchmark microarray datasets to demonstrate its superiority in both timing and validity.