Can Parallel Replication Benefit Hadoop Distributed File System for High Performance Interconnects?

4 Author(s)
Nusrat S. Islam ; Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA ; Xiaoyi Lu ; Md. Wasi-ur-Rahman ; Dhabaleswar K. Panda

The Hadoop Distributed File System (HDFS) is a popular choice for Big Data applications due to its reliability and fault tolerance. HDFS provides fault tolerance and availability guarantees by replicating each data block to multiple DataNodes. The current implementation of HDFS in Apache Hadoop performs replication in a pipelined fashion, resulting in higher replication times. Such large replication times adversely impact the performance of real-time, latency-sensitive applications. In this paper, we propose an alternative parallel replication scheme applicable to both the socket-based design of HDFS and the RDMA-based design of HDFS over InfiniBand. We analyze the challenges and issues in parallel replication and compare its performance with the existing pipelined replication scheme in HDFS over 1 GigE, IPoIB (IP over InfiniBand), 10 GigE, and RDMA (Remote Direct Memory Access) over InfiniBand. Experiments performed over high-performance networks (IPoIB, 10 GigE, and IB) show that the proposed parallel replication scheme outperforms the default pipelined design for a variety of benchmarks. We observe up to a 16% reduction in the execution time of the TeraGen benchmark. We also increase the throughput reported by the TestDFSIO benchmark by up to 12%. The proposed parallel replication also improves HBase Put operation performance by 17%. However, for lower-performance networks such as 1 GigE and for smaller data sizes, parallel replication does not improve performance.
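
The difference between the two replication schemes can be illustrated with a minimal sketch of which node sends a block to which. This is not the authors' implementation and does not use any Hadoop API; the function names and node labels are hypothetical, and the sketch only lists the transfers, without modeling acknowledgements, packetization, or timing.

```python
def pipelined_transfers(client, datanodes):
    """Pipelined scheme: each node forwards the block to the next in the chain.

    Hypothetical helper for illustration; returns (sender, receiver) pairs.
    """
    chain = [client] + list(datanodes)
    return list(zip(chain, chain[1:]))


def parallel_transfers(client, datanodes):
    """Parallel scheme: the client writes the block to every DataNode itself.

    Hypothetical helper for illustration; returns (sender, receiver) pairs.
    """
    return [(client, dn) for dn in datanodes]


if __name__ == "__main__":
    nodes = ["dn1", "dn2", "dn3"]  # assumed replication factor of 3
    print(pipelined_transfers("client", nodes))
    print(parallel_transfers("client", nodes))
```

In the pipelined case the last replica is reached only after the block has passed through every earlier DataNode, whereas in the parallel case the client carries all outgoing transfers itself. The second point is consistent with the abstract's observation that the parallel scheme does not help on slower networks such as 1 GigE.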

Published in:

2013 IEEE 21st Annual Symposium on High-Performance Interconnects

Date of Conference:

21-23 Aug. 2013