Data provenance, also called data lineage, records the derivation history of a data product. In the earth sciences, geospatial data provenance is important because it supports the evaluation of data quality and usability, audit trails, workflow replication, and product reproducibility. Geospatial provenance metadata is usually generated as a geo-processing workflow executes, so the two are complementary, and integrating them promises great benefit. However, the heterogeneity of data and computing resources in a distributed environment built on a service-oriented architecture (SOA) makes resource integration challenging. Specifically, the lack of interoperability and compatibility among provenance metadata models, and between provenance and workflow, creates obstacles to integrating provenance with geo-processing workflows. To tackle these issues, this paper first addresses provenance heterogeneity by recording provenance information in the standard lineage model defined in ISO 19115:2003 and ISO 19115-2:2009. Second, it bridges the gap between provenance and geo-processing workflow by extending both the workflow language and the service interface, which makes it possible to capture provenance information automatically in a geospatial web service environment. The proposed method is implemented in GeoBrain, an SOA-based geospatial web service system. Test results from the implementation show that geospatial provenance information is successfully captured throughout the life cycle of geo-processing workflows and properly recorded in the ISO standard lineage model.
IEEE Transactions on Geoscience and Remote Sensing (Volume: 51, Issue: 11)
Date of Publication: Nov. 2013