MapReduce is a programming model widely used to process large-scale data. Many tasks can be implemented within the framework, such as search-engine data processing and machine learning. However, current implementations of MapReduce do not support the join operation efficiently. Earlier work proposed Map-Reduce-Merge for the join operator; because of the time cost of the Reduce phase, however, we argue that a join implementation is better off omitting the Reduce step and the cost it incurs. In this paper, we design and implement a join algorithm for relational data in a MapReduce environment. We also present a method for joining many relations. We conduct a series of experiments to verify the effectiveness and efficiency of the proposed methods.
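
To illustrate the general idea of a join that avoids the Reduce phase, the following is a minimal single-process sketch of a map-only hash join. It is not the algorithm of this paper, whose details the abstract does not give. It assumes that the smaller relation fits in memory and can be shared with every map task. The names `build_hash_table`, `map_join`, and `map_only_join` are hypothetical, and the loop over splits merely stands in for map tasks that a real framework would run in parallel.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple  # a relational tuple; the join-key position is passed explicitly


def build_hash_table(small: Iterable[Row], key_idx: int) -> Dict[object, List[Row]]:
    """Index the smaller relation by join key (built once, shared by all map tasks)."""
    table: Dict[object, List[Row]] = defaultdict(list)
    for row in small:
        table[row[key_idx]].append(row)
    return table


def map_join(split: Iterable[Row], key_idx: int,
             table: Dict[object, List[Row]]) -> Iterator[Row]:
    """One map task: probe the shared table with each row of an input split.

    Joined rows are emitted directly, so no shuffle or Reduce step is involved.
    """
    for row in split:
        for match in table.get(row[key_idx], []):
            yield row + match


def map_only_join(splits: Iterable[Iterable[Row]], key_idx_large: int,
                  small: Iterable[Row], key_idx_small: int) -> List[Row]:
    """Simulate the job: assumption here is one map task per split, run in sequence."""
    table = build_hash_table(small, key_idx_small)
    joined: List[Row] = []
    for split in splits:
        joined.extend(map_join(split, key_idx_large, table))
    return joined
```

Under the same assumption, a join over several relations could be sketched by calling `map_only_join` repeatedly, feeding the output of one join in as the splits of the next.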