Hadoop cluster with FPGA-based hardware accelerators for K-means clustering algorithm | IEEE Conference Publication | IEEE Xplore

Abstract:

This paper presents an implementation of the K-means clustering algorithm on a Hadoop cluster with FPGA-based hardware accelerators. The proposed design follows the MapReduce programming model and uses the Hadoop Distributed File System (HDFS) to store large datasets. The FPGA-based hardware accelerator that speeds up the K-means clustering algorithm is implemented on Xilinx VC707 evaluation boards (EVBs). The proposed Hadoop cluster consists of four computers: one Master Node and three Slave Nodes. The Slave Nodes communicate with the VC707 EVBs through Gigabit Ethernet. Experimental results show that, when clustering an input dataset of 125 million three-dimensional data points, the proposed design achieves a 4× speedup over the Hadoop cluster without FPGA-based hardware accelerators.
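As a rough illustration of how K-means maps onto the MapReduce model, the sketch below runs one iteration in plain Python on three-dimensional points. It is not the authors' implementation: it uses no Hadoop API, the function names (`kmeans_map`, `kmeans_reduce`, `kmeans_iteration`) are hypothetical, and the split of work between map and reduce is the common textbook formulation, assumed here because the abstract does not describe it. The abstract also does not say which step the FPGA accelerates, so the sketch makes no attempt to model the accelerator.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float, float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 3-D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_map(point: Point, centroids: List[Point]) -> Tuple[int, Tuple[Point, int]]:
    """Map step (assumed formulation): emit (nearest centroid index, (point, 1))."""
    nearest = min(
        range(len(centroids)),
        key=lambda i: squared_distance(point, centroids[i]),
    )
    return nearest, (point, 1)


def kmeans_reduce(values: Iterable[Tuple[Point, int]]) -> Point:
    """Reduce step (assumed formulation): average all points sharing a key."""
    total = [0.0, 0.0, 0.0]
    count = 0
    for point, n in values:
        for d in range(3):
            total[d] += point[d]
        count += n
    return (total[0] / count, total[1] / count, total[2] / count)


def kmeans_iteration(points: Iterable[Point], centroids: List[Point]) -> List[Point]:
    """One K-means iteration: map every point, group by key, reduce each group."""
    groups: Dict[int, List[Tuple[Point, int]]] = defaultdict(list)
    for p in points:
        key, value = kmeans_map(p, centroids)
        groups[key].append(value)
    # A centroid that attracts no points keeps its previous position.
    updated = list(centroids)
    for key, values in groups.items():
        updated[key] = kmeans_reduce(values)
    return updated


if __name__ == "__main__":
    data = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 10.0, 10.0), (12.0, 10.0, 10.0)]
    start = [(1.0, 1.0, 1.0), (9.0, 9.0, 9.0)]
    print(kmeans_iteration(data, start))
```

In a real Hadoop job the grouping by key is done by the framework's shuffle phase, and the iteration is repeated until the centroids stop moving; both are simplified away here.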
Date of Conference: 12-14 June 2017
Date Added to IEEE Xplore: 27 July 2017
Conference Location: Taipei, Taiwan

I. Introduction

Big data analytics has become a focus of research in government, manufacturing, healthcare, media, and science because of the growing popularity of the Internet of Things (IoT). In fact, every industry now has to confront the challenges of big data analytics. The properties of big data include variety, volume, velocity, and value; these “4Vs” are widely used to define big data [1]. Because of these properties, big data is difficult to analyze on a single computer. Many highly scalable, fault-tolerant software frameworks, such as Hadoop and Spark, have been developed to handle massive data.
