With the rise of cloud computing, data mining clustering algorithms built on cloud computing have become a research focus for scholars both at home and abroad. This article addresses the problem of clustering large data sets. Using cloud computing technology, it studies in depth the Hadoop platform and the parallel K-means clustering algorithm, and it proposes a model for clustering massive data based on Hadoop, together with new ideas for a parallel K-means algorithm.
Published in:
Computer Science & Service System (CSSS), 2012 International Conference on
Date of Conference: 11-13 Aug. 2012
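
The abstract does not describe the proposed model in detail, so the following is only a minimal sketch of the commonly used MapReduce formulation of K-means, not the authors' method. It runs in a single process in plain Python and does not use Hadoop. The function names (`map_phase`, `reduce_phase`, `kmeans`) and the parameters `max_iter` and `tol` are assumptions made for illustration. The map step assigns each point to its nearest centroid, and the reduce step averages the points assigned to each centroid.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, available in Python 3.8+


def map_phase(points, centroids):
    """Mapper (hypothetical): emit (nearest centroid index, (point, 1))."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, (p, 1)


def reduce_phase(pairs, centroids):
    """Reducer (hypothetical): sum points per centroid index, return new means."""
    sums = {}
    counts = defaultdict(int)
    for idx, (p, n) in pairs:
        if idx not in sums:
            sums[idx] = list(p)
        else:
            sums[idx] = [a + b for a, b in zip(sums[idx], p)]
        counts[idx] += n
    # A centroid that received no points keeps its previous position.
    new_centroids = list(centroids)
    for idx, total in sums.items():
        new_centroids[idx] = tuple(v / counts[idx] for v in total)
    return new_centroids


def kmeans(points, centroids, max_iter=20, tol=1e-9):
    """Repeat map + reduce until no centroid moves more than tol."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift <= tol:
            break
    return centroids
```

On a real cluster, each iteration would typically be one MapReduce job, with the mappers running over separate input splits and the current centroids distributed to every mapper; that wiring is omitted here.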