Skip to Main Content
MapReduce is Google's programming model for easy development of scalable parallel applications which process huge quantity of data on many clusters. Due to its conveniency and efficiency, MapReduce is used in various applications (e.g., Web search services and online analytical processing). However, there are only few good benchmarks to evaluate MapReduce implementations by realistic testsets. In this paper, we present MRBench that is a benchmark for evaluating MapReduce systems. MRBench focuses on processing business oriented queries and concurrent data modifications. To this end, we build MRBench to deal with large volumes of relational data and execute highly complex queries. By MRBench, users can evaluate the performance of MapReduce systems while varying environmental parameters such as data size and the number of (map/reduce) tasks. Our extensive experimental results show that MRBench is a useful tool to benchmark the capability of answering critical business questions.