Service-oriented architecture (SOA) allows multiple heterogeneous data resources to be integrated within a single service while hiding their implementation details and formats from the service's users. However, the data sources for a service are often geographically distributed and connected by long-latency networks, so the time and bandwidth consumed by data transport can degrade system performance. Dynamic data replication is a practical solution to this problem: by replicating data to appropriate sites, it aims to reduce the time and bandwidth consumed over the network. Existing strategies for dynamic replication are typically based on so-called single-location algorithms, which identify a single site for data replication. In this paper we discuss the issues with single-location strategies in large-scale data integration applications and examine potential multiple-location schemes. Dynamic multiple-location replication is NP-complete. We therefore transform the multiple-location problem into several classical mathematical problems with different parameter settings, for which efficient approximation algorithms exist. Experimental results indicate that, unlike single-location strategies, our multiple-location schemes are efficient with respect to access latency and bandwidth consumption, especially when the requesters of a data set are distributed across a large number of locations.
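To make the contrast between single-location and multiple-location placement concrete, the following is a minimal sketch only. The abstract does not name the classical problems used, so the sketch assumes a k-median-style formulation (choose k replica sites to minimize demand-weighted access latency) and a simple greedy heuristic. The names `total_cost` and `greedy_placement`, the latency matrix, and the demand values are all hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def total_cost(latency: Sequence[Sequence[float]],
               demand: Sequence[float],
               replicas: Sequence[int]) -> float:
    """Demand-weighted latency when each requester reads from its nearest replica."""
    return sum(d * min(latency[r][s] for s in replicas)
               for r, d in enumerate(demand))


def greedy_placement(latency: Sequence[Sequence[float]],
                     demand: Sequence[float],
                     k: int) -> List[int]:
    """Pick up to k replica sites, adding at each step the site that
    lowers the total cost the most (an assumed heuristic, not the paper's)."""
    n_sites = len(latency[0])
    chosen: List[int] = []
    for _ in range(min(k, n_sites)):
        candidates = (s for s in range(n_sites) if s not in chosen)
        best = min(candidates,
                   key=lambda s: total_cost(latency, demand, chosen + [s]))
        chosen.append(best)
    return chosen


# Hypothetical data: latency[r][s] is the latency from requester r to site s.
# Requesters 0-1 and 2-3 form two clusters separated by a slow link.
latency = [
    [1, 2, 50, 51],
    [2, 1, 51, 50],
    [50, 51, 1, 2],
    [51, 50, 2, 1],
]
demand = [10, 10, 10, 10]  # requests issued by each requester

single = greedy_placement(latency, demand, 1)
multiple = greedy_placement(latency, demand, 2)
print(single, total_cost(latency, demand, single))
print(multiple, total_cost(latency, demand, multiple))
```

In this toy instance, a single replica leaves one cluster paying the long-latency link on every request, whereas a second replica lets each cluster read locally. This only illustrates the intuition for widely distributed requesters; it is not a reproduction of the paper's experiments.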