As MapReduce becomes increasingly popular in data processing applications, the demand for Hadoop clusters grows. However, Hadoop is incompatible with existing cluster batch job queuing systems and requires a dedicated cluster under its full control. Hadoop also lacks support for user access control, accounting, fine-grained performance monitoring, and legacy batch job processing comparable to that of existing cluster job queuing systems. This makes dedicated Hadoop clusters less suitable for administrators and ordinary users whose hybrid computing needs involve both MapReduce and legacy applications. As a result, obtaining a suitably configured and sized Hadoop cluster has not been easy in organizations with existing clusters. This paper presents Cloud BATCH, a prototype solution to this problem that enables Hadoop to function as a traditional batch job queuing system with enhanced functionality for cluster resource management. With Cloud BATCH, a complete shift to Hadoop for managing an entire cluster to cater for hybrid computing needs becomes feasible.