Enhancing MapReduce Using MPI and an Optimized Data Exchange Policy

2 Author(s)
Mohamed, H.; Marchand-Maillet, S. (Computer Vision & Multimedia Lab., University of Geneva, Geneva, Switzerland)

MapReduce is a programming model proposed by Google to simplify large-scale data processing. In contrast, the message passing interface (MPI) standard is extensively used for algorithmic parallelization, as it accommodates an efficient communication infrastructure. In the original implementation of MapReduce, the reduce function can only start processing after the map function has terminated. If the map function is slow for any reason, this delays the whole run. In this paper, we propose MapReduce overlapping using MPI, an adapted structure of the MapReduce programming model for fast, data-intensive processing. Our implementation runs the map and reduce functions concurrently in parallel by exchanging partial intermediate data between them in a pipeline fashion using MPI, while maintaining the usability and simplicity of MapReduce. Experimental results on two applications (Word Count and Distributed Inverted Indexing) show a good speedup compared to existing MapReduce implementations such as Hadoop and the available MPI-MapReduce implementations. For Word Count on 53 GB of data, we achieve 1.9x and 5.3x speedup compared to Hadoop and MPI-MapReduce, respectively.
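
A minimal sketch of the overlapping idea described above, not the authors' implementation: mapper ranks post non-blocking MPI sends for each partial result so a reducer rank can start merging while later input blocks are still being mapped. The single-reducer layout (rank 0), the fixed-size per-letter bucket format, and the DONE_TAG termination protocol are illustrative assumptions.

/*
 * Hypothetical sketch of overlapping map and reduce with MPI.
 * Mapper ranks stream partial word-count buckets to a reducer rank
 * as soon as each input block is processed, instead of waiting for
 * the whole map phase to finish.
 */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

#define NUM_BLOCKS  4     /* input blocks handled per mapper rank (toy value) */
#define BUCKET_SIZE 26    /* one count per letter 'a'..'z' (toy key space)    */
#define DATA_TAG    0
#define DONE_TAG    1     /* signals that a mapper has no more data           */

/* Toy "map" step: produce one partial bucket for one input block. */
static void map_block(int rank, int block, long bucket[BUCKET_SIZE])
{
    memset(bucket, 0, BUCKET_SIZE * sizeof(long));
    /* Stand-in for real parsing: derive a fake count from rank/block. */
    bucket[(rank + block) % BUCKET_SIZE] = 1;
}

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* Reducer: merge partial buckets as they arrive, overlapping
         * with mappers that are still processing later blocks.       */
        long total[BUCKET_SIZE] = {0}, partial[BUCKET_SIZE];
        int active = size - 1;
        while (active > 0) {
            MPI_Status st;
            MPI_Recv(partial, BUCKET_SIZE, MPI_LONG, MPI_ANY_SOURCE,
                     MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == DONE_TAG) {
                active--;                         /* this mapper finished */
            } else {
                for (int i = 0; i < BUCKET_SIZE; i++)
                    total[i] += partial[i];       /* reduce step          */
            }
        }
        printf("reduced %d buckets from %d mappers\n", BUCKET_SIZE, size - 1);
    } else {
        /* Mapper: process blocks and ship each partial result immediately
         * with a non-blocking send, so communication overlaps the map
         * work on the next block.                                         */
        long buckets[NUM_BLOCKS][BUCKET_SIZE];
        MPI_Request reqs[NUM_BLOCKS];
        for (int b = 0; b < NUM_BLOCKS; b++) {
            map_block(rank, b, buckets[b]);
            MPI_Isend(buckets[b], BUCKET_SIZE, MPI_LONG, 0,
                      DATA_TAG, MPI_COMM_WORLD, &reqs[b]);
        }
        MPI_Waitall(NUM_BLOCKS, reqs, MPI_STATUSES_IGNORE);
        /* Tell the reducer this mapper is done. */
        long empty[BUCKET_SIZE] = {0};
        MPI_Send(empty, BUCKET_SIZE, MPI_LONG, 0, DONE_TAG, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}

Typical usage would be to compile with mpicc and launch with, e.g., mpirun -np 4: one reducer and three mappers, with the reduce work proceeding while the map work is still in flight.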

Published in:

2012 41st International Conference on Parallel Processing Workshops (ICPPW)

Date of Conference:

10-13 September 2012