MapReduce has become the dominant programming model in cloud-based data processing environments such as Hadoop. First In First Out (FIFO) is the default job scheduling policy of Hadoop, but it cannot guarantee that a job will be completed by a specific deadline. Prior research has focused on developing deadline-based MapReduce schedulers that use non-preemptive scheduling. Compared with the non-preemptive approach, however, preemptive scheduling has advantages in total completion time and slot utilization. In this paper, we first formulate the preemptive scheduling problem under a deadline constraint and then propose preemptive scheduling algorithms. To our knowledge, we have implemented the first real preemptive job scheduler that meets deadlines on Hadoop. The experimental results indicate that the preemptive scheduling approach is promising and is more efficient than the non-preemptive one for executing jobs under a given deadline.
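As a rough illustration of why FIFO cannot guarantee deadlines while a preemptive policy can, the sketch below simulates a single execution slot under two policies. It is not the algorithm proposed in this paper, which the abstract does not describe; it substitutes generic preemptive earliest-deadline-first (EDF) scheduling, and the `Job` fields, function names, and job parameters are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # Hypothetical job model: all times are in abstract slot-time units.
    name: str
    arrival: int   # time the job is submitted
    work: int      # units of slot time needed to finish
    deadline: int  # absolute deadline


def fifo(jobs):
    """Non-preemptive FIFO on one slot: run each job to completion in arrival order."""
    clock, finish = 0, {}
    for job in sorted(jobs, key=lambda j: j.arrival):
        clock = max(clock, job.arrival) + job.work
        finish[job.name] = clock
    return finish


def preemptive_edf(jobs):
    """Preemptive EDF on one slot (an assumed stand-in policy), in unit time steps."""
    by_name = {j.name: j for j in jobs}
    remaining = {j.name: j.work for j in jobs}
    clock, finish = 0, {}
    while remaining:
        ready = [by_name[n] for n in remaining if by_name[n].arrival <= clock]
        if not ready:
            clock += 1  # slot is idle until the next arrival
            continue
        # Re-evaluating every step lets a newly arrived urgent job preempt.
        job = min(ready, key=lambda j: j.deadline)
        remaining[job.name] -= 1
        clock += 1
        if remaining[job.name] == 0:
            del remaining[job.name]
            finish[job.name] = clock
    return finish


def missed(jobs, finish):
    """Names of jobs whose finish time exceeds their deadline."""
    return sorted(j.name for j in jobs if finish[j.name] > j.deadline)


# A long job with a loose deadline arrives first; a short urgent job arrives later.
jobs = [Job("A", arrival=0, work=10, deadline=20),
        Job("B", arrival=1, work=2, deadline=5)]

fifo_finish = fifo(jobs)
edf_finish = preemptive_edf(jobs)
print("FIFO missed:", missed(jobs, fifo_finish))
print("Preemptive EDF missed:", missed(jobs, edf_finish))
```

In this toy setting, FIFO makes the short job B wait behind the long job A, so B finishes after its deadline, whereas the preemptive policy suspends A when B arrives and both jobs finish on time. A real Hadoop scheduler must also account for multiple map and reduce slots and for the cost of suspending tasks, which this sketch ignores.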