The Foundations for Provenance on the Web

Provenance, i.e., the origin or source of something, is becoming an important concern, since it offers the means to verify data products, to infer their quality, and to decide whether they can be trusted. For instance, provenance enables the reproducibility of scientific results; it is necessary to track attribution and credit in curated databases; and it is essential for reasoners to make trust judgements about the information they use over the Semantic Web. Because the Web allows information sharing, discovery, aggregation, filtering and flow in an unprecedented manner, it also becomes difficult to identify the original source of information found on the Web. This survey contends that provenance can and should be reliably tracked and exploited on the Web, and investigates the foundations necessary to achieve such a vision.