Data placement is an important problem in scientific workflows: the goal is to reduce data movement by placing datasets in a few data centers according to each center's storage capacity and the dependencies among the datasets. Data placement has been proved to be NP-hard, and several methods for it, such as the K-means clustering algorithm, have been presented in the literature. K-means clustering reduces the number of data movements well, but it may concentrate the datasets in a few data centers, so that the loads of the data centers deviate greatly from one another. This paper proposes a data placement strategy based on a heuristic genetic algorithm that reduces data movements among the data centers while balancing their loads. Simulation results show that the proposed algorithm effectively reduces data movements and balances the load across data centers.
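The trade-off described above can be illustrated with a small sketch of how a candidate placement might be scored inside a genetic algorithm. The abstract does not give the paper's fitness function, so everything here is an assumption made for illustration: the function names, the rule that a task runs where most of its inputs already reside, the use of the standard deviation of utilization as the imbalance measure, and the `weight` parameter.

```python
from collections import Counter
from statistics import pstdev

def data_movements(placement, tasks):
    """Count datasets that must be moved so each task has all its inputs.

    Assumption: a task runs at the center that already holds most of its inputs.
    """
    moves = 0
    for inputs in tasks:
        per_center = Counter(placement[d] for d in inputs)
        moves += len(inputs) - max(per_center.values())
    return moves

def utilization(placement, sizes, capacities):
    """Return the fraction of storage capacity used at each data center."""
    used = [0.0] * len(capacities)
    for dataset, center in placement.items():
        used[center] += sizes[dataset]
    return [u / cap for u, cap in zip(used, capacities)]

def load_imbalance(placement, sizes, capacities):
    """Population standard deviation of utilization (0 means perfectly balanced)."""
    return pstdev(utilization(placement, sizes, capacities))

def fitness(placement, tasks, sizes, capacities, weight=1.0):
    """Lower is better. Hypothetical weighted sum, not the paper's formula."""
    if any(u > 1.0 for u in utilization(placement, sizes, capacities)):
        return float("inf")  # placement exceeds some center's storage capacity
    return data_movements(placement, tasks) + weight * load_imbalance(
        placement, sizes, capacities
    )

# Three unit-size datasets, two tasks, two centers with capacity 4 each.
tasks = [["a", "b"], ["b", "c"]]
sizes = {"a": 1, "b": 1, "c": 1}
capacities = [4, 4]
concentrated = {"a": 0, "b": 0, "c": 0}  # no movement, but unbalanced load
spread = {"a": 0, "b": 0, "c": 1}        # one movement, better balance
```

In this toy instance the concentrated placement needs no data movement but leaves one center empty, while the spread placement trades one movement for a more even load. A genetic algorithm would evolve a population of such placements, using a score of this kind to select between them.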
Date of Conference: 17-18 Nov. 2012