MapReduce is a simple and flexible parallel programming model proposed by Google for large-scale distributed data processing. In this paper, we present a design and prototype implementation of MapReduce for the Cell Broadband Engine® Architecture (CBEA). The MapReduce model provides a simple machine abstraction that shields users from parallelization and other complications of distributed programming. The goal of this paper is to describe the tradeoffs in the design of the runtime and to demonstrate its potential for high performance. We study the basic characteristics of the MapReduce model and identify three types of MapReduce applications: map-dominated, partition-dominated, and sort-dominated. We evaluate the performance, scalability, and efficiency of our runtime on microbenchmarks representing each of these application types, as well as on complete applications. We find that map-dominated applications map well to the CBEA and that our prototype sustains high performance on these applications. For partition-dominated and sort-dominated applications, we analyze runtime performance, identify sources of inefficiency, and propose several future enhancements to significantly improve performance. Overall, we find that the simplicity and efficiency of the model make it an attractive tool for programming Cell Broadband Engine processor-based platforms.
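The phases behind the three application types named above (map, partition, and sort, followed by a reduce step) can be illustrated with a minimal sequential sketch. This is not the CBEA runtime described in this paper: it is plain single-threaded Python, and every name in it (`map_phase`, `partition_phase`, `sort_phase`, `reduce_phase`, `run_mapreduce`, and the word-count functions) is a hypothetical choice made only for illustration.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(records, map_fn):
    """Apply map_fn to every input record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(map_fn(record))
    return pairs


def partition_phase(pairs, num_partitions):
    """Route each pair to a partition by key hash so that equal keys land together."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def sort_phase(partition):
    """Sort one partition by key so that equal keys become adjacent."""
    return sorted(partition, key=itemgetter(0))


def reduce_phase(sorted_pairs, reduce_fn):
    """Group adjacent equal keys and reduce each group to a single value."""
    return {
        key: reduce_fn(key, [value for _, value in group])
        for key, group in groupby(sorted_pairs, key=itemgetter(0))
    }


def run_mapreduce(records, map_fn, reduce_fn, num_partitions=4):
    """Run the four phases in sequence (illustrative only, no parallelism)."""
    pairs = map_phase(records, map_fn)
    result = {}
    for partition in partition_phase(pairs, num_partitions):
        result.update(reduce_phase(sort_phase(partition), reduce_fn))
    return result


# Hypothetical word-count application: user code supplies only these two functions.
def wc_map(line):
    return [(word, 1) for word in line.split()]


def wc_reduce(word, counts):
    return sum(counts)


if __name__ == "__main__":
    print(run_mapreduce(["a b a", "b c"], wc_map, wc_reduce))
```

In this sketch the user writes only `wc_map` and `wc_reduce`; the partitioning, sorting, and grouping are handled by the surrounding functions, which is the sense in which the model shields users from parallelization details. Which phase dominates the running time depends on how expensive the map function is relative to the volume of intermediate pairs that must be partitioned and sorted.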
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.