The Hadoop file system is a large-scale distributed file system used to manage and quickly process extremely large data sets. We want to use Hadoop to support data-intensive workloads in a distributed campus grid environment. Unfortunately, the Hadoop file system is not designed to work easily or securely in such an environment. We present a solution that bridges the Chirp distributed file system to Hadoop for simple access to large data sets. Chirp layers on top of Hadoop many features desirable in grid computing, including simple deployment without special privileges, easy access via Parrot, and strong, flexible security through access control lists (ACLs). We discuss the challenges involved in using Hadoop on a campus grid and evaluate the performance of the combined systems.
Date of Conference: Nov. 30 2010-Dec. 3 2010