Data Grids provide services and infrastructure for distributed data-intensive applications that access massive, geographically distributed datasets. An important technique for speeding up access in Data Grids is replication, which places data closer to where it is requested. Much of the work on the replica placement problem has focused on average system performance and ignored quality-assurance issues. In a Data Grid environment, resource availability, network latency, and users' requests may change over time; moreover, different sites may have different service-quality requirements. In this paper, we introduce a new, highly distributed and decentralized replica placement algorithm for hierarchical Data Grids that determines the positions of a minimum number of replicas expected to satisfy given quality requirements. Our placement algorithm exploits the access history of popular data files and computes replica locations by minimizing the overall replication cost (read and update) while maximizing QoS satisfaction for a given traffic pattern. The problem is formulated using dynamic programming. We evaluate our algorithm using OptorSim. A comparison between our algorithm and its QoS-unconstrained counterpart shows that our algorithm substantially shortens job execution time while consuming only moderate bandwidth for data transfer.
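To make the placement problem concrete, the following is a minimal sketch (not the paper's algorithm) of QoS-constrained replica placement on a hierarchical Grid modeled as a tree: each site `i` must find a replica within `qos[i]` hops on its path toward the root, which holds the master copy. The greedy strategy of processing sites deepest-first and placing each forced replica as high as possible is a standard heuristic for this coverage model; the tree shape and QoS values below are purely illustrative.

```python
def place_replicas(parent, qos):
    """Place a small number of replicas in a tree-shaped Data Grid.

    parent[i] is the parent of site i (the root has parent -1 and holds
    the master copy); qos[i] is the maximum allowed hop distance from i
    to the nearest replica on i's path to the root.
    Returns the set of sites holding a replica (including the root).
    """
    n = len(parent)

    def depth(i):
        d = 0
        while parent[i] != -1:
            i = parent[i]
            d += 1
        return d

    root = next(i for i in range(n) if parent[i] == -1)
    replicas = {root}  # the root always holds the master copy

    # Process sites deepest-first: a deep site's requirement can only be
    # met by replicas on its (short) ancestor path, so it is the most
    # constrained and should be satisfied first.
    for v in sorted(range(n), key=depth, reverse=True):
        # Walk up at most qos[v] hops looking for an existing replica.
        u, hops, covered = v, 0, False
        while True:
            if u in replicas:
                covered = True
                break
            if hops == qos[v] or parent[u] == -1:
                break
            u = parent[u]
            hops += 1
        if not covered:
            # Place the forced replica as high as possible (qos[v] hops
            # above v), so it can also serve shallower relatives.
            u, hops = v, 0
            while hops < qos[v] and parent[u] != -1:
                u = parent[u]
                hops += 1
            replicas.add(u)
    return replicas
```

This captures only the QoS-coverage side of the problem; the paper's dynamic-programming formulation additionally weighs read and update (consistency-maintenance) costs when choosing among feasible placements.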