With ever-expanding datasets, efficient data management in grids becomes important. This paper describes Cabinet, which employs two techniques for efficiently managing data in grids: a caching system and a new file staging approach called coordinated staging. The caching system is designed around the characteristics of grid applications. Coordinated staging is based on the BitTorrent protocol model and is specifically designed for High Throughput Computing (HTC) applications, a common use case for grids. In coordinated staging, each site that is assigned to execute an individual job of the HTC application treats the other execution sites as potential replica stores. In our evaluation, coordinated staging reduced the download time of a file by a factor of 3.85 and increased download throughput by a factor of 2.86 compared with the conventional approach of transferring the file from a single source.
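To make the idea of treating other execution sites as replica stores concrete, the following is a minimal illustrative sketch, not Cabinet's actual implementation. It assumes a file is split into fixed pieces, as in BitTorrent, and that each site knows which pieces its peers already hold. The function name `plan_downloads`, the `holdings` map, the `"origin"` label, and the least-loaded peer selection policy are all hypothetical choices made for this example; the abstract does not specify them.

```python
from collections import defaultdict


def plan_downloads(num_pieces, requester, holdings, origin="origin"):
    """Choose a source for every piece the requester still lacks.

    holdings maps a site name to the set of piece indices it already has.
    Other execution sites are preferred as sources; the origin server is
    the fallback when no peer site holds a piece. (Hypothetical policy.)
    """
    have = holdings.get(requester, set())
    load = defaultdict(int)  # number of pieces assigned to each source so far
    plan = {}
    for piece in range(num_pieces):
        if piece in have:
            continue  # already staged locally
        peers = sorted(
            site for site, pieces in holdings.items()
            if site != requester and piece in pieces
        )
        if peers:
            # Spread requests: pick the least-loaded peer, ties broken by name.
            source = min(peers, key=lambda site: load[site])
        else:
            source = origin
        load[source] += 1
        plan[piece] = source
    return plan


# Example: siteC has nothing yet; siteA and siteB hold partial copies.
holdings = {"siteA": {0, 1}, "siteB": {1, 2}, "siteC": set()}
plan = plan_downloads(4, "siteC", holdings)
print(plan)  # {0: 'siteA', 1: 'siteB', 2: 'siteB', 3: 'origin'}
```

In this sketch only piece 3 is fetched from the single original source; the remaining pieces come from peer execution sites, which is the intuition behind spreading transfer load across sites rather than a single server.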