Scientific workflow management systems (SWfMS) help scientists prototype and execute in silico experiments. They can systematically collect provenance information about derived data products so that it can be queried later. Despite efforts to build a standard Open Provenance Model (OPM), provenance remains tightly coupled to the SWfMS that records it. As a result, scientific workflow provenance concepts, representations, and mechanisms are heterogeneous, difficult to integrate, and dependent on the SWfMS. To help compare, integrate, and analyze scientific workflow provenance, this paper presents a taxonomy of provenance characteristics. Its classification enables computer scientists to distinguish between different perspectives on provenance and guides them toward a better understanding of provenance data in general. The analysis of existing approaches will assist us in managing provenance data from distributed, heterogeneous workflow executions.