Large-scale, data-intensive streaming applications in many scientific fields rely on complex workflows structured as directed acyclic graphs (DAGs), composed of distributed computing modules with intricate inter-module dependencies. Supporting such workflows in high-performance network environments and optimizing their throughput are crucial to collaborative scientific exploration and discovery. We formulate workflow mapping as a frame rate optimization problem and propose an efficient heuristic solution, which we integrate into the Condor-based Scientific Workflow Automation and Management Platform (SWAMP) in place of Condor's default mapping scheme. We also augment SWAMP with several new components to improve the workflow management process. The superior performance of the proposed solution is verified through both simulations and a real-life scientific workflow for climate modeling deployed in a distributed heterogeneous network environment.