Scientific workflows are heterogeneous, loosely coupled, large-scale, and distributed, and these properties pose new challenges for application monitoring. In this paper, we identify the main challenges facing a monitoring infrastructure for grid scientific workflows. To address those challenges, we describe key design elements of a monitoring infrastructure that facilitates interoperability between diverse monitoring and instrumentation tools. We propose a standardized taxonomy for grid workflow monitoring events. We identify automatic resource discovery as the key problem in enabling on-line monitoring of grid workflows, and we propose a solution based on a distributed hash table (DHT) infrastructure federated with monitoring services. We then assess this solution using a model-based performance evaluation. We also propose an ontology-based monitoring information model to describe records of workflow runs.