Provenance & Cryptographic Provenance

Two distinct, layered primitives — don't conflate them:

1. Content-addressed fact IDs (BLAKE2b)

A fact's identity is a BLAKE2b-128 digest (16 bytes). It is not a signature, and no write-time timestamp goes into it:

  • Content store (the GMP content backend): fact_id = BLAKE2b-128(content ‖ sep ‖ tenant_id)
  • Structured facts: fact_id = BLAKE2b-128(predicate ‖ subject ‖ object ‖ valid_from)

This makes identity deterministic and tenant-scoped. It is the PROVENANCE capability (SourceMeta.write_id, written_at, written_by).
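The two formulas above can be sketched with Python's standard hashlib. This is a minimal illustration, not the library's code: the separator byte, the UTF-8 encoding, the use of a separator between structured fields, and the function names content_fact_id and structured_fact_id are all assumptions, since this page does not specify them.

```python
import hashlib

# Assumption: the real value of `sep` is not given on this page.
SEP = b"\x00"


def content_fact_id(content: bytes, tenant_id: str) -> bytes:
    """BLAKE2b-128 over content ‖ sep ‖ tenant_id (hypothetical helper)."""
    h = hashlib.blake2b(digest_size=16)  # 16 bytes = 128 bits
    h.update(content)
    h.update(SEP)
    h.update(tenant_id.encode("utf-8"))
    return h.digest()


def structured_fact_id(predicate: str, subject: str, obj: str, valid_from: str) -> bytes:
    """BLAKE2b-128 over the four fields (hypothetical helper).

    Assumption: fields are UTF-8 encoded and joined with SEP so that
    ("ab", "c") and ("a", "bc") cannot collide.
    """
    parts = (predicate, subject, obj, valid_from)
    data = SEP.join(p.encode("utf-8") for p in parts)
    return hashlib.blake2b(data, digest_size=16).digest()


id_a = content_fact_id(b"the sky is blue", "tenant-a")
id_a_again = content_fact_id(b"the sky is blue", "tenant-a")
id_b = content_fact_id(b"the sky is blue", "tenant-b")
```

Here id_a and id_a_again are equal (deterministic), while id_b differs because the tenant_id is part of the hashed input (tenant-scoped).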

2. Ed25519 signatures (CRYPTOGRAPHIC_PROVENANCE)

When WriteOptions.signing_key (a 32-byte Ed25519 seed) is supplied, the store signs the 16-byte fact_id:

signature, public_key = sign_provenance(signing_key, fact_id) # Ed25519

The signature and public key land in SourceMeta.signature / SourceMeta.public_key. The conformance check alters the content and confirms that the signature no longer verifies (tamper detection). Signing requires the crypto extra: pip install "grafomem[crypto]".
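The sign-then-tamper flow can be sketched with the third-party cryptography package. This is an illustration under assumptions: sign_fact_id and verify_fact_id are hypothetical stand-ins for sign_provenance and verify_provenance, the page does not say which Ed25519 library the crypto extra actually uses, and fact_id_of hashes the content alone for brevity.

```python
import hashlib
from typing import Tuple

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def sign_fact_id(seed: bytes, fact_id: bytes) -> Tuple[bytes, bytes]:
    """Hypothetical stand-in for sign_provenance: (signature, public_key)."""
    private_key = Ed25519PrivateKey.from_private_bytes(seed)  # 32-byte seed
    public_key = private_key.public_key().public_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PublicFormat.Raw,
    )
    return private_key.sign(fact_id), public_key


def verify_fact_id(public_key: bytes, signature: bytes, fact_id: bytes) -> bool:
    """Hypothetical stand-in for verify_provenance."""
    try:
        Ed25519PublicKey.from_public_bytes(public_key).verify(signature, fact_id)
        return True
    except InvalidSignature:
        return False


def fact_id_of(content: bytes) -> bytes:
    # Simplified for the demo: omits sep and tenant_id.
    return hashlib.blake2b(content, digest_size=16).digest()


seed = bytes(range(32))  # demo seed only; never use a predictable key
original_id = fact_id_of(b"the sky is blue")
signature, public_key = sign_fact_id(seed, original_id)

# Altering the content changes the fact_id, so the old signature fails.
tampered_id = fact_id_of(b"the sky is green")
ok_original = verify_fact_id(public_key, signature, original_id)
ok_tampered = verify_fact_id(public_key, signature, tampered_id)
```

Because the signature covers the fact_id and the fact_id is derived from the content, any change to the content is detected without signing the content itself.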

The split

BLAKE2b addresses the fact (identity). Ed25519 signs it (authenticity). PROVENANCE gives you the first; CRYPTOGRAPHIC_PROVENANCE adds the second.

Source: src/aml/provenance.py, src/aml/backends/interface.py (verify_provenance)