Fuzzy c-means clustering is a well-known unsupervised algorithm that has been widely used in various pattern recognition applications. As the amount of data increases, however, the basic serial implementation becomes overwhelmed. This is the main motivation for utilizing the computational power of parallel machines to speed up the c-means algorithm. We present an algorithm that exploits the mathematical structure of the c-means update equations to express them as building blocks of linear algebra functions that are optimized for most available parallel architectures. We implemented our algorithm on both GPU (using CUDA and CUBLAS) and MPI (using MPI4py and NumPy), then evaluated their performance and scalability. Experiments show that our implementation outperforms all GPU implementations of c-means proposed so far.
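To illustrate the idea of casting the c-means updates as linear algebra, the following is a minimal NumPy sketch (not the authors' implementation): the center update becomes a matrix product of the fuzzified membership matrix with the data, and the membership update is built from elementwise distance operations, all of which map naturally onto BLAS-style parallel primitives. The function name and parameters here are illustrative assumptions.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=50, seed=0):
    """Fuzzy c-means expressed via dense linear-algebra operations.

    Illustrative sketch only: X is an (n, d) data matrix, c the number
    of clusters, m > 1 the fuzzifier. Returns (centers, memberships).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)          # each row sums to 1
    for _ in range(n_iter):
        W = U ** m                             # fuzzified memberships
        # Center update as a single matrix product: (c, n) @ (n, d)
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distances from every point to every center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)             # guard against divide-by-zero
        # Membership update: normalized inverse-distance weighting
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U
```

In a parallel setting, the `W.T @ X` product is the piece handed off to an optimized GEMM (e.g. in CUBLAS), and the remaining elementwise operations parallelize trivially across points.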