Skip to Main Content
Among scheduling algorithms of scientific workflows, the graph partitioning is a technique to minimize data transfer between nodes or clusters. However, when the graph partitioning is simply applied to a complex workflow DAG, tasks in each parallel phase are not always evenly assigned to computation nodes since the graph partitioning algorithm is not aware of edge directions that represent task dependencies. Thus, we propose a new method of task assignment based on Multi-Constraint Graph Partitioning. This method relates the dimension of weight vectors to the rank of a task phase defined by traversing the task graph. Our algorithm is implemented in the Pwrake workflow system and evaluated the performance of the Montage workflow using a computer cluster. The result shows that the file size accessed from remote nodes is reduced from 88% to 14% of the total file size accessed during the workflow and that the elapsed time is reduced by 31%.