Provenance Backbone

According to the CPM, each provenance component consists of two types of information:

  1. Traversal information. The traversal information contains the location of, and other information about, the preceding, current, and subsequent provenance components of a chain, so it can be used as a signpost to navigate through the chain. The semantics of the traversal information is common to all processes documented by the CPM: it is a mapping between the inputs and the related outputs of a documented process. In particular, the structure of the traversal information is defined in the CPM, so that algorithms processing a provenance chain can rely on this common underlying structure.
  2. Domain/process specific information. The domain/process specific information contains relevant details about the history of the process or object (digital or physical) described in the provenance component. The semantics of this information is always specific to the domain of the documented process or object. The CPM prescribes how domain specific provenance is attached to the traversal information in a standardised way, and provides general requirements for representing aspects common to diverse domains (e.g., how to represent links to documentation outside finalised provenance).

The provenance backbone is the part of a provenance chain that corresponds to the union of the traversal information present in the components of the chain (Figure 3). The CPM prescribes the structure of the underlying provenance graph that represents the traversal information, and consequently the structure of the provenance backbone. This Deliverable D6.6 describes how the CPM defines the semantics of nodes and edges present on the backbone.

Figure 3: Simplified schema of a provenance chain. Each component of the chain (here Bundle 1, etc.) consists of the traversal information and domain specific information. A provenance backbone is formed by traversal information present in distinct provenance components. Figure cited from [Wittner2022].
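
The split between traversal information and domain specific information can be illustrated with a minimal Python sketch. It is not the CPM serialisation: the class names, field names (inputs, outputs, preceding, following), bundle identifiers and domain details are all hypothetical and chosen only for illustration. The sketch shows that the backbone can be assembled as the union of the traversal parts alone, and that a chain can then be navigated without reading any domain specific content.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class TraversalInfo:
    """Hypothetical stand-in for the traversal part of one component."""
    component_id: str
    inputs: tuple = ()                # identifiers consumed by the process
    outputs: tuple = ()               # identifiers produced by the process
    preceding: Optional[str] = None   # location of the preceding component
    following: Optional[str] = None   # location of the following component


@dataclass
class ProvenanceComponent:
    """One component: common traversal part plus free-form domain part."""
    traversal: TraversalInfo
    domain_specific: dict = field(default_factory=dict)


def backbone(chain):
    """Union of the traversal information of all components in the chain."""
    return {c.traversal.component_id: c.traversal for c in chain}


def walk_forward(bb, start):
    """Follow 'following' links using the backbone only."""
    order, current = [], start
    # Stop at the end of the chain, at an unknown location, or on a cycle.
    while current in bb and current not in order:
        order.append(current)
        current = bb[current].following
    return order


# Illustrative three-component chain (all values are made up).
chain = [
    ProvenanceComponent(
        TraversalInfo("bundle1", outputs=("sample",), following="bundle2"),
        {"note": "domain detail of the first process"},
    ),
    ProvenanceComponent(
        TraversalInfo("bundle2", inputs=("sample",), outputs=("dataset",),
                      preceding="bundle1", following="bundle3"),
        {"note": "domain detail of the second process"},
    ),
    ProvenanceComponent(
        TraversalInfo("bundle3", inputs=("dataset",), preceding="bundle2"),
        {"note": "domain detail of the third process"},
    ),
]

bb = backbone(chain)
print(walk_forward(bb, "bundle1"))
```

Because walk_forward receives only the backbone, it works unchanged whatever the domain specific dictionaries contain, which mirrors the reason the CPM fixes the structure of the traversal information.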