K-Means for Parallel Architectures Using All-Prefix-Sum Sorting and Updating Steps

Authors: Kohlhoff, K.J. (Dept. of Bioeng., Stanford Univ., Stanford, CA, USA); Pande, V.S.; Altman, R.B.

We present an implementation of parallel K-means clustering, called Kps-means, that achieves high performance with near-full occupancy compute kernels without imposing limits on the number of dimensions and data points permitted as input, thus combining flexibility with high degrees of parallelism and efficiency. As a key element to performance improvement, we introduce parallel sorting as data preprocessing and updating steps. Our final implementation for Nvidia GPUs achieves speedups of up to 200-fold over CPU reference code and of up to three orders of magnitude when compared with popular numerical software packages.
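
The abstract names sorting and all-prefix-sum as the preprocessing and updating steps but gives no further detail, so the following is only a minimal sequential sketch of the general idea, not the authors' Kps-means code. It assumes that points are grouped by cluster label with a counting sort whose segment offsets come from an exclusive prefix sum, so that each centroid update reads one contiguous segment. All function names here (`assign_labels`, `sort_by_label`, `update_centroids`, `kmeans`) are hypothetical, and the loops run on the CPU where a GPU version would presumably use parallel kernels.

```python
from itertools import accumulate


def assign_labels(points, centroids):
    """Assign each point the index of its nearest centroid (squared Euclidean)."""
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def sort_by_label(labels, k):
    """Counting sort of point indices by label (assumed sorting scheme)."""
    counts = [0] * k
    for lab in labels:
        counts[lab] += 1
    # Exclusive prefix sum: offsets[j] is where cluster j's segment starts.
    offsets = [0] + list(accumulate(counts))[:-1]
    cursor = offsets[:]  # next free slot in each cluster's segment
    order = [0] * len(labels)
    for i, lab in enumerate(labels):
        order[cursor[lab]] = i
        cursor[lab] += 1
    return order, offsets, counts


def update_centroids(points, centroids, order, offsets, counts):
    """Average each cluster's contiguous segment of the sorted index list."""
    updated = []
    for j, (start, n) in enumerate(zip(offsets, counts)):
        if n == 0:
            updated.append(list(centroids[j]))  # empty cluster keeps its centroid
            continue
        segment = [points[i] for i in order[start:start + n]]
        updated.append([sum(col) / n for col in zip(*segment)])
    return updated


def kmeans(points, centroids, max_iter=100):
    """Iterate assignment and sorted update until labels stop changing."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = assign_labels(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
        order, offsets, counts = sort_by_label(labels, len(centroids))
        centroids = update_centroids(points, centroids, order, offsets, counts)
    return centroids, labels
```

For example, `kmeans([(0, 0), (0, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)])` groups the first two and the last two points. Sorting by label makes each cluster's members contiguous, which is the property a parallel reduction would rely on; nothing in this sketch limits the number of dimensions or points.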

Published in: IEEE Transactions on Parallel and Distributed Systems (Volume: 24, Issue: 8)