Sophisticated methods and technologies have been introduced into business workflow systems to address cooperation and load balancing. However, load balancing in scientific workflows, which combine data management with flow management, has received less attention. As a result, conflicting scheduling and load imbalance often occur. In this paper, we analyze the functional framework and load consumption of a scientific workflow engine in detail, and then introduce load models for the workflow application, the workflow runtime engine, and the grid node. More importantly, we introduce a utility function that quantifies the various loads and performance, laying the foundation for engine selection and task scheduling. Real-time status is collected and evaluated by a monitor and a high-layer planner. Considering the load balancing of both the scientific workflow engine and the grid node, we propose a double-layer scheduling algorithm based on the performance model. Experimental results demonstrate that the proposed models and double-layer scheduling algorithm effectively avoid load imbalance and conflicting scheduling across engines and grid nodes. Moreover, the completion time of scientific workflow applications is greatly reduced owing to the full and even utilization of engines and grid nodes.
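The two-layer idea described above (first select a workflow engine, then select a grid node, both ranked by a utility value) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Resource` fields, the weights, the linear form of `utility`, and the `schedule` helper are hypothetical and are not the load models or the algorithm defined in the paper.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical load snapshot for an engine or a grid node."""
    name: str
    cpu_load: float   # fraction in [0, 1]
    mem_load: float   # fraction in [0, 1]
    queue_len: int    # tasks currently assigned


def utility(r, w_cpu=0.5, w_mem=0.3, w_queue=0.2, max_queue=10):
    """Assumed utility: a weighted sum of loads, inverted so that a
    higher value means more spare capacity. The weights are arbitrary."""
    queue_load = min(r.queue_len / max_queue, 1.0)
    load = w_cpu * r.cpu_load + w_mem * r.mem_load + w_queue * queue_load
    return 1.0 - load


def pick(resources):
    """Return the resource with the highest utility."""
    return max(resources, key=utility)


def schedule(engines, nodes_by_engine):
    """Layer 1: choose an engine. Layer 2: choose a grid node among
    those reachable from that engine. Both queues are then updated so
    the next decision sees the new load."""
    engine = pick(engines)
    node = pick(nodes_by_engine[engine.name])
    engine.queue_len += 1
    node.queue_len += 1
    return engine.name, node.name


engines = [Resource("e1", 0.9, 0.8, 5), Resource("e2", 0.2, 0.1, 1)]
nodes_by_engine = {
    "e1": [Resource("n0", 0.5, 0.5, 2)],
    "e2": [Resource("n1", 0.7, 0.5, 3), Resource("n2", 0.1, 0.2, 0)],
}
assignment = schedule(engines, nodes_by_engine)
print(assignment)
```

In this sketch the lightly loaded engine `e2` and node `n2` are chosen, and their queue lengths are incremented so that repeated calls spread tasks rather than piling onto one resource. A real system would refresh the load figures from a monitor instead of relying on the local counters.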