GRAFOMEM — The Grafomem Memory Protocol
Agent memory you can verify, not just declare.
GRAFOMEM began as a benchmark for one question — what should a standard for agent memory actually specify? — and became the answer: a benchmark, a protocol (GMP), an executable conformance suite, and a certified reference implementation.
The thesis, in one line:
Memory capabilities are orthogonal, a declared capability is not the same as observed behavior, and the only way to tell them apart is to test.
The problem
AI agents increasingly depend on persistent memory, yet there is no shared account of what a memory system must do — recall, versioning, deletion, tenant isolation — and so no neutral way to compare or certify one. Vendors benchmark themselves; buyers are told to "run your own eval."
The solution
GMP defines ten capabilities (all FROZEN in v0.2), eight metrics (M1–M8), and a conformance suite that empirically verifies — via two-sided, bootstrap-CI testing — what a backend actually does. "Supports capability X" is defined operationally: passes the conformance suite for X.
Conformance does not prove correctness; it measures it. Each direction of the two-sided test passes only if its bootstrap confidence interval excludes the failing outcome (GMP §8.2). That distinction is the whole point of the project.
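To make the pass rule concrete, here is a minimal sketch of a bootstrap-CI check. It is an illustration, not the code in src/aml/eval/conformance.py: the function names, the percentile method, the 2,000 resamples, the 95% level, and the failing value of 0.5 are all assumptions chosen for the example, and the rule in GMP §8.2 may differ in each of them. The reading of "two-sided" used here (one set of trials where the behavior should appear, one where its absence is the correct outcome) is likewise an assumption.

```python
import random


def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-trial scores in [0, 1]."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    # Resample the trials with replacement and record each resample's mean.
    means = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    lo = means[int(alpha / 2 * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


def direction_passes(outcomes, failing_value=0.5):
    """One direction passes only if the whole CI sits above the failing value.

    failing_value is a hypothetical threshold, not a number from the spec.
    """
    lo, _hi = bootstrap_ci(outcomes)
    return lo > failing_value


def supports(positive_trials, negative_trials):
    """Assumed two-sided rule: both directions must pass independently."""
    return direction_passes(positive_trials) and direction_passes(negative_trials)
```

Under these assumptions, a backend that succeeds on every trial in both directions is reported as supporting the capability, while one that succeeds on only half the trials in either direction is not, because its interval straddles the failing value.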
The stack
| Layer | What it is | Where |
|---|---|---|
| Benchmark | 10 workloads (W1–W10), 20 findings, locked corpus — 135 traces, 61,754 turns, 17,612 queries | src/aml/generator/, scripts/run_w*.py |
| Paper | Technical report | docs/grafomem-paper.pdf |
| Spec | GMP v0.2 — protocol semantics (RFC 2119) | docs/gmp-spec-v0.2.md |
| Conformance | executable §8: supports X ≝ passes the suite for X | src/aml/eval/conformance.py |
| Reference | in-memory + persistent SQLite stores, both at M8 = 1.000 | src/aml/backends/ |
Who is this for?
- Memory-system builders: implement the MemoryBackend protocol, run conformance, publish a signed report.
- Agent-framework authors: target a stable, verifiable memory contract instead of a vendor SDK.
- Enterprise AI & compliance teams: independent, signed evidence of deletion and tenant-isolation guarantees.
Live right now
pip install grafomem # PyPI · Python 3.11+
- Live reference server: https://grafomem-production.up.railway.app/health
- API docs (Swagger): https://grafomem-production.up.railway.app/docs
The distribution package on PyPI is grafomem; the import package is aml. Install with pip install grafomem, then import with from aml... import ....