MapReduce is a state-of-the-art computation paradigm that is becoming widely used for processing large-scale datasets. Hadoop is an open-source implementation of MapReduce and follows a master-slave architecture. This architecture gives Hadoop a single point of failure in the JobTracker. In this paper, we design a solution that removes this single point of failure and thereby improves the availability of the JobTracker. In this solution, a standby JobTracker is introduced to act as a hot backup node for the active JobTracker. The standby JobTracker keeps its view of job execution synchronized with the active JobTracker by collecting and parsing the job log. If the active JobTracker fails, the standby JobTracker can take over quickly. The solution is implemented in Hadoop 0.20.x. Extensive experiments show that it effectively improves the availability of the JobTracker. A large production cluster at a large e-commerce company has adopted this solution, which avoids interrupting job submission and execution when the JobTracker fails or restarts.
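The hot-standby idea described above can be illustrated with a minimal sketch. It is not the paper's implementation and uses no Hadoop API: the class name `StandbyTracker`, the two-token log format, and the heartbeat-timeout failure detector are all assumptions made for illustration, since the abstract does not specify how the log is structured or how a failure is detected.

```python
import time
from dataclasses import dataclass, field


@dataclass
class StandbyTracker:
    """Toy standby node: mirrors job state from log lines and
    promotes itself when the active node stops sending heartbeats.
    All names and formats here are hypothetical."""
    heartbeat_timeout: float = 3.0             # seconds; assumed value
    jobs: dict = field(default_factory=dict)   # job_id -> last known status
    last_heartbeat: float = field(default_factory=time.monotonic)
    active: bool = False

    def apply_log_line(self, line: str) -> None:
        # Assumed log format: "<EVENT> <job_id>", e.g. "JOB_SUBMITTED job_1".
        parts = line.split()
        if len(parts) != 2:
            return  # skip malformed lines in this sketch
        event, job_id = parts
        if event == "JOB_SUBMITTED":
            self.jobs[job_id] = "RUNNING"
        elif event == "JOB_FINISHED":
            self.jobs[job_id] = "DONE"
        # Any other event type is ignored here.

    def on_heartbeat(self, now: float) -> None:
        # Record the time of the latest heartbeat from the active node.
        self.last_heartbeat = now

    def check_failover(self, now: float) -> bool:
        # Promote the standby once the heartbeat gap exceeds the timeout.
        if not self.active and now - self.last_heartbeat > self.heartbeat_timeout:
            self.active = True
        return self.active

    def unfinished_jobs(self) -> list:
        # Jobs the standby would need to keep tracking after takeover.
        return sorted(j for j, s in self.jobs.items() if s != "DONE")
```

Because the standby has already replayed the log, it knows which jobs are still running at the moment it takes over, which is what lets takeover proceed without resubmitting work. For example, after replaying `JOB_SUBMITTED job_1`, `JOB_SUBMITTED job_2`, and `JOB_FINISHED job_1`, a promoted standby in this sketch would report `job_2` as the only unfinished job.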