Large-scale storage systems are crucial components in data-intensive applications such as search engine clusters, video-on-demand servers, sensor networks, and grid computing. A storage server typically consists of a set of storage devices. In such systems, data layouts may need to be reconfigured over time for load balancing or in the event of system failures or upgrades. It is critical to migrate data to their target locations as quickly as possible to obtain the best performance. Most previous results on data migration assume that each storage node can perform only one data transfer at a time. A storage node, however, can typically handle multiple transfers simultaneously, and this can reduce the total migration time significantly. Moreover, storage devices tend to have heterogeneous capabilities, because devices may be added over time as storage demand increases. In this paper, we consider the heterogeneous data migration problem, in which each storage node v has its own transfer constraint c_v, the number of simultaneous transfers that v can handle. We develop algorithms to minimize the data migration time. We show that it is possible to find an optimal migration schedule when every c_v is even. Furthermore, although the problem is NP-hard in general, we give an efficient algorithm that offers a rigorous (1 + o(1))-approximation guarantee.
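To make the model concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes that every transfer takes one unit-length round, that a transfer occupies one slot at both its source and its destination, and that every c_v is at least 1. The function names `degree_lower_bound` and `greedy_schedule` and the sample data are hypothetical and chosen only for illustration. The first function computes the simple bound that node v, involved in d_v transfers, needs at least ceil(d_v / c_v) rounds; the second is a naive first-fit baseline.

```python
from collections import Counter
from math import ceil


def degree_lower_bound(transfers, capacity):
    """Lower bound on the number of rounds: max over nodes of ceil(d_v / c_v).

    transfers: list of (source, destination) pairs, one per data item.
    capacity:  dict mapping node v to c_v (assumed >= 1).
    """
    degree = Counter()
    for src, dst in transfers:
        degree[src] += 1
        degree[dst] += 1
    return max((ceil(d / capacity[v]) for v, d in degree.items()), default=0)


def greedy_schedule(transfers, capacity):
    """Naive first-fit baseline (illustrative only, not optimal in general).

    In each round, scan the pending transfers in order and start every
    transfer whose two endpoints still have a free slot in that round.
    """
    pending = list(transfers)
    rounds = []
    while pending:
        load = Counter()  # slots used per node in the current round
        this_round, deferred = [], []
        for src, dst in pending:
            if load[src] < capacity[src] and load[dst] < capacity[dst]:
                load[src] += 1
                load[dst] += 1
                this_round.append((src, dst))
            else:
                deferred.append((src, dst))
        rounds.append(this_round)
        pending = deferred
    return rounds


# Hypothetical instance: nodes a and b handle two transfers at once, c only one.
transfers = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "c")]
capacity = {"a": 2, "b": 2, "c": 1}

bound = degree_lower_bound(transfers, capacity)   # 2
schedule = greedy_schedule(transfers, capacity)   # 3 rounds
```

On this instance the bound is 2 rounds, and a 2-round schedule exists: (a,b) and (a,c) in the first round, then (a,b) and (b,c) in the second. The first-fit baseline needs 3 rounds, which illustrates why the order in which transfers are grouped matters once nodes have different capacities.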