Data-intensive computing is receiving increasing attention from computer science researchers. As data volumes grow even faster than Moore's Law, many traditional systems fail to cope with extremely large datasets. In this paper, we use a real-world graph processing application to demonstrate the challenges posed by emerging data-intensive computing and present a solution based on Sector/Sphere, a system that we have developed over the last several years. Sector provides scalable, fault-tolerant storage on commodity computers, while Sphere supports in-storage parallel data processing through a simplified programming interface. This paper describes the rationale behind Sector/Sphere and shows how to use it to process massive graphs effectively.