The MapReduce programming model simplifies the design and implementation of certain parallel algorithms. Recently, several research groups have extended MapReduce's application domain to iterative and online data processing. Although their data access characteristics differ, these extensions rely on the same storage facility as the original model and propagate data updates using additional techniques. To benefit from large main memories, fast data access, and stronger data consistency, we propose employing in-memory storage for extended MapReduce. In this paper, we describe the design and implementation of EMR, an in-memory framework for extended MapReduce. To illustrate the usage and performance of our framework, we present measurements of typical MapReduce applications.