The Common Provenance Model

Provenance information describes the history of an object and, depending on its content, can be used to verify the quality and reliability of research results. In particular, provenance has been adopted in scientific domains to support traceable lineage of research objects, such as biological material, data, or workflows. Provenance information is not meant to replace the logging infrastructures or information systems that currently document research objects, their related processes, or their metadata. Instead, it remains available over the long term and is typically used to assess the quality of the documented object (its fitness for purpose) or to reproduce documented processes.

Because research objects are typically exchanged between organisations, each organisation can provide provenance information about only a portion of the documented research object's life cycle. The Common Provenance Model (CPM) is a provenance model for capturing and exchanging interoperable provenance information. It creates common ground across different domains in the Life Sciences and builds a harmonised understanding of provenance information.

The CPM focuses on, but is not limited to, the provenance of research objects, such as datasets, biological or environmental specimens, computational models, software, tools, or experimental results. The CPM supports reproducibility by providing a conceptual foundation for trustworthy provenance information whose parts are each generated, stored, and managed by a different organisation. In particular, the CPM defines a common horizontal framework for representing links between the provenance parts, and common semantics for binding domain-specific provenance information.
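
The idea of linking provenance parts held by different organisations can be illustrated with a minimal Python sketch. It is not the CPM's actual data model or serialisation: the class `ProvenancePart`, its fields, the function `trace_lineage`, and the organisation names are all hypothetical and chosen only for illustration. The sketch assumes that each part records a reference to the part that precedes it, so that the full lineage of an object can be reassembled by following those references.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProvenancePart:
    """One organisation's portion of an object's provenance (hypothetical structure)."""
    part_id: str
    organisation: str
    activities: List[str] = field(default_factory=list)
    # Hypothetical link to the part that documents the preceding life-cycle stage.
    previous_part: Optional[str] = None


def trace_lineage(parts: Dict[str, ProvenancePart], last_part_id: str) -> List[ProvenancePart]:
    """Follow the links backwards and return the parts in chronological order.

    Raises KeyError if a referenced part is missing and ValueError on a cycle.
    """
    chain: List[ProvenancePart] = []
    seen = set()
    current: Optional[str] = last_part_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at part {current!r}")
        seen.add(current)
        part = parts[current]  # KeyError here means a link points to an unknown part
        chain.append(part)
        current = part.previous_part
    chain.reverse()
    return chain


# Illustrative data: three organisations, each documenting its own stage.
parts = {
    "p1": ProvenancePart("p1", "Biobank A", ["sample acquisition", "storage"]),
    "p2": ProvenancePart("p2", "Laboratory B", ["sequencing"], previous_part="p1"),
    "p3": ProvenancePart("p3", "Institute C", ["data analysis"], previous_part="p2"),
}

lineage = trace_lineage(parts, "p3")
for part in lineage:
    print(part.organisation, "->", ", ".join(part.activities))
```

In this sketch each organisation keeps control of its own part, and only the links need to be agreed on across organisations, which mirrors the role the text assigns to a common horizontal framework.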

The model forms an open conceptual foundation for the ISO 23494 series, "Provenance information model for biological material and data".