Proof bundle format
A proof bundle is a zip containing four files. It is the externally-visible deliverable and the thing that must work end-to-end — design decisions protect its integrity first.
flowchart LR
subgraph Inputs
M[MerkleRoots + OTSProof]
S["Snapshots metadata<br/>+ HMAC commitments"]
K[Ed25519 signing key]
end
AB[assemble_bundle_data] --> CJ[canonical_bundle_json]
M --> AB
S --> AB
CJ --> JSON[bundle.json]
CJ --> SG[sign_canonical_bytes]
K --> SG --> SIG[bundle.sig.json]
AB --> PDF[render_bundle_pdf] --> PDF_F[bundle.pdf]
VER[verify_template.py] --> VERF[verify.py]
JSON --> ZIP[[bundle.zip]]
SIG --> ZIP
PDF_F --> ZIP
VERF --> ZIP
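The final packing step in the diagram can be sketched roughly as follows. This is a minimal illustration, not the backend's code: `pack_bundle_zip` and its signature are hypothetical, and only the four member names are taken from the diagram.

```python
import io
import zipfile


def pack_bundle_zip(bundle_json: bytes, sig_json: bytes,
                    pdf: bytes, verify_py: bytes) -> bytes:
    """Pack the four bundle members into an in-memory zip (hypothetical helper)."""
    members = {
        "bundle.json": bundle_json,
        "bundle.sig.json": sig_json,
        "bundle.pdf": pdf,
        "verify.py": verify_py,
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buf.getvalue()


# Placeholder payloads, only to show the resulting archive layout.
archive = pack_bundle_zip(b"{}", b"{}", b"%PDF-1.4", b"print('ok')\n")
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    names = sorted(zf.namelist())
print(names)
```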
Contents
bundle.pdf
Human-readable cover document. Contains:
- Signed attestation of authorship (the text the author agreed to the first time they generated a bundle).
- Writing statistics: total captures, active days, peak day, final word count, sessions.
- Word-count-over-time chart — rendered by reportlab Drawing primitives directly in the PDF, no PNG intermediate.
- Cryptographic appendix: Merkle roots, OTS receipt fingerprints, Bitcoin block heights where known.
- One page of verification instructions for the publisher.
- A bundle_identifier_hex footer on every page: the content hash of bundle.json.
Zero manuscript content.
bundle.json
The canonical, content-addressed payload. This is what verify.py actually reads. Deterministic JSON — sorted keys, UTF-8, no trailing whitespace, stable floating-point formatting — so the same inputs always produce the same bytes, and so bundle_identifier_hex (SHA-256 of this file) stably identifies a specific bundle.
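As a rough sketch of what deterministic serialization plus content addressing can look like with the standard library: the helper names below are hypothetical, and the exact rules (compact separators, no trailing newline) are assumptions rather than the backend's confirmed behaviour. Floating-point formatting and the treatment of the bundle_identifier_hex field itself are not covered here.

```python
import hashlib
import json


def canonical_bytes(payload: dict) -> bytes:
    # Assumed canonical form: sorted keys, compact separators, UTF-8,
    # no trailing whitespace. The real rules live in the backend.
    text = json.dumps(payload, sort_keys=True,
                      separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def identifier_hex(payload: dict) -> str:
    # SHA-256 over the canonical bytes, as a lowercase hex string.
    return hashlib.sha256(canonical_bytes(payload)).hexdigest()


# The same data with a different key order yields the same bytes.
a = {"format_version": 1, "author": {"email": "a@example.com"}}
b = {"author": {"email": "a@example.com"}, "format_version": 1}
print(canonical_bytes(a) == canonical_bytes(b))
print(identifier_hex(a) == identifier_hex(b))
```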
Schema (illustrative — see backend/api/bundle.py and backend/api/schemas.py for the authoritative shape):
{
"format_version": 1,
"bundle_identifier_hex": "...", // SHA-256 of the canonical bytes
"generated_at": "2026-04-22T14:00:00Z",
"author": {
"email": "...",
"attestation_text": "...", // user-facing attestation
"attestation_signed_at": "..."
},
"writing_summary": {
"capture_count": 412,
"active_days": 91,
"peak_day": { "date": "...", "word_count": 8102 },
"final_word_count": 85000,
"session_count": 128
},
"merkle_roots": [
{
"computed_at": "2026-04-09T00:00:00Z",
"root_hash_hex": "...",
"leaves_hex": ["...", "...", "..."], // HMAC commitments in order
"ots_receipt_b64": "...", // DetachedTimestampFile
"bitcoin_block_height": 891234 // null if pending
},
...
],
"snapshots": [
{
"captured_at": "...",
"plaintext_hmac_hex": "...", // = Merkle leaf
"merkle_root_hash_hex": "...", // which day this save anchors to
"word_count": 1234,
"char_count": 7890,
"file_type": "md"
},
...
]
}
Note: no path, filename, or file contents. The only manuscript-derived value is plaintext_hmac_hex, which is an HMAC under a key the server never sees.
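To make the commitment concrete, here is a minimal sketch of computing and checking an HMAC-SHA256 value on the author's side. The function names are hypothetical, and how mac_key is derived or stored is out of scope for this page.

```python
import hashlib
import hmac


def commitment_hex(mac_key: bytes, plaintext: bytes) -> str:
    # HMAC-SHA256 over the manuscript bytes under the author-held key.
    return hmac.new(mac_key, plaintext, hashlib.sha256).hexdigest()


def matches(mac_key: bytes, plaintext: bytes, expected_hex: str) -> bool:
    # Constant-time comparison against a recorded plaintext_hmac_hex value.
    return hmac.compare_digest(commitment_hex(mac_key, plaintext), expected_hex)


key = b"k" * 32                      # placeholder key for illustration only
leaf = commitment_hex(key, b"Chapter one")
print(matches(key, b"Chapter one", leaf))
print(matches(b"x" * 32, b"Chapter one", leaf))
```

Without the key, the hex value reveals nothing practical about the manuscript, which is why it can be included in the bundle.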
bundle.sig.json
Ed25519 signature over the canonical bundle.json bytes.
{
"alg": "Ed25519",
"signature_b64": "...",
"public_key_fingerprint": "..." // SHA-256 truncated; stable across bundles
}
The public-key fingerprint is embedded so the verifier can confirm the bundle was signed by BlindProof's production signing key (rather than a substituted one). The Ed25519 private key lives as the BLINDPROOF_SIGNING_KEY Fly secret; its public key is published alongside the docs.
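A fingerprint check of this kind might look like the sketch below. The truncation length (16 hex characters) and the helper names are assumptions for illustration; the page only states that the fingerprint is a truncated SHA-256.

```python
import hashlib
import hmac

FINGERPRINT_HEX_CHARS = 16  # assumed truncation length, not confirmed by the docs


def key_fingerprint(public_key_bytes: bytes) -> str:
    # Truncated SHA-256 of the raw public key bytes.
    return hashlib.sha256(public_key_bytes).hexdigest()[:FINGERPRINT_HEX_CHARS]


def is_expected_key(public_key_bytes: bytes, published_fingerprint: str) -> bool:
    # Compare against the fingerprint published alongside the docs.
    return hmac.compare_digest(key_fingerprint(public_key_bytes),
                               published_fingerprint)


example_key = bytes(range(32))       # placeholder 32-byte public key
fp = key_fingerprint(example_key)
print(is_expected_key(example_key, fp))
print(is_expected_key(bytes(32), fp))
```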
verify.py
The checking tool. A PEP 723 single-file Python script that is stdlib-only apart from exactly one external dependency: opentimestamps-client (needed to call the ots CLI for Bitcoin verification).
What it does:
- Loads bundle.json and bundle.sig.json.
- Recomputes bundle_identifier_hex from the canonical bytes and confirms the signature using the embedded public-key fingerprint.
- For each merkle_roots[i]: re-derives root_hash_hex from leaves_hex using the same Bitcoin-style SHA-256 tree the backend used. Refuses to continue if they don't match.
- If a manuscript file path is supplied, checks that the manuscript matches a recorded snapshot. The publisher does not hold mac_key, so verify.py cannot compute HMAC-SHA256(mac_key, plaintext) on its own. Instead, it requires the author to pass the manuscript and the derived HMAC, or to read the HMAC from a sidecar in the bundle. The manuscript-match flow is explained in Verifying a bundle.
- For each ots_receipt_b64: writes the bytes to a temp file, invokes ots verify -d <digest_hex> <tempfile>, and maps the output to PASS / PENDING / FAIL.
- Prints a summary: overall verdict, plus one line per check.
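The Merkle re-derivation step can be sketched as below. This assumes that "Bitcoin-style" means double SHA-256 over the concatenated pair, with the last node duplicated when a level has an odd count; the backend's actual tree may differ in those details, and `merkle_root_hex` is a hypothetical name.

```python
import hashlib


def _pair_hash(data: bytes) -> bytes:
    # Assumption: Bitcoin-style double SHA-256.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_hex(leaves_hex: list[str]) -> str:
    if not leaves_hex:
        raise ValueError("at least one leaf is required")
    level = [bytes.fromhex(leaf) for leaf in leaves_hex]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the odd node out
        level = [_pair_hash(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0].hex()


leaves = [hashlib.sha256(bytes([i])).hexdigest() for i in range(3)]
root = merkle_root_hex(leaves)
tampered = merkle_root_hex([leaves[0], leaves[2], leaves[1]])
print(root != tampered)  # reordering the leaves changes the root
```

A verifier would compare the result with the recorded root_hash_hex and stop on any mismatch.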
It is deliberately simple — a publisher should be able to audit the script in a reasonable afternoon. Keep it stdlib-only; every added dependency weakens the durability promise.
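The OTS step described above could be approximated as follows. The command line mirrors the one quoted in the list; the `classify` mapping is an assumption (it treats any output mentioning "pending" as PENDING and otherwise relies on the exit code) and is not the client's documented contract. Both function names are hypothetical.

```python
import base64
import os
import subprocess
import tempfile


def classify(returncode: int, output: str) -> str:
    # Assumed mapping from CLI result to verdict; marker text is illustrative.
    if "pending" in output.lower():
        return "PENDING"
    return "PASS" if returncode == 0 else "FAIL"


def verify_receipt(ots_receipt_b64: str, digest_hex: str) -> str:
    # Decode the receipt, write it to a temp file, and shell out to `ots`.
    receipt = base64.b64decode(ots_receipt_b64, validate=True)
    with tempfile.NamedTemporaryFile(suffix=".ots", delete=False) as tmp:
        tmp.write(receipt)
        path = tmp.name
    try:
        proc = subprocess.run(["ots", "verify", "-d", digest_hex, path],
                              capture_output=True, text=True)
        return classify(proc.returncode, proc.stdout + proc.stderr)
    finally:
        os.unlink(path)  # always remove the temp file
```

Keeping the classification separate from the subprocess call makes the mapping easy to audit and to test without the ots binary installed.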
Determinism and stability
The canonical JSON is deterministic: given the same backend state, regenerating a bundle for the same user produces the same bytes (same bundle_identifier_hex, same signature). This is important for a few reasons:
- A publisher can compare two bundles produced at different times and quickly see what changed (new snapshots appended; nothing rewritten).
- A republishing author can produce a fresh bundle before delivery without invalidating an earlier one.
- The fingerprint in the PDF footer lets non-technical readers match a PDF to its canonical payload at a glance.
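The "appended, nothing rewritten" comparison from the first point can be sketched as a prefix check. This assumes the snapshots array keeps a stable order between bundles; `is_append_only` is a hypothetical helper, not part of verify.py.

```python
def is_append_only(older: dict, newer: dict) -> bool:
    # True when every snapshot in the older bundle appears unchanged,
    # in the same position, at the start of the newer bundle's list.
    old = older["snapshots"]
    new = newer["snapshots"]
    return len(new) >= len(old) and new[:len(old)] == old


first = {"snapshots": [{"plaintext_hmac_hex": "aa", "word_count": 10}]}
second = {"snapshots": [{"plaintext_hmac_hex": "aa", "word_count": 10},
                        {"plaintext_hmac_hex": "bb", "word_count": 25}]}
rewritten = {"snapshots": [{"plaintext_hmac_hex": "aa", "word_count": 11}]}

print(is_append_only(first, second))
print(is_append_only(first, rewritten))
```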
If you change bundle.json's shape, bump format_version. verify.py must tolerate older formats for at least as long as any bundle produced under them might still be in circulation — which, given the durability promise, is "forever".
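One way to keep older formats verifiable is to dispatch on format_version and never remove an entry. The sketch below is illustrative only: the checker names are hypothetical, and the required-key list is taken from the illustrative schema above rather than the authoritative one.

```python
REQUIRED_V1 = ("author", "merkle_roots", "snapshots", "writing_summary")


def check_v1(bundle: dict) -> list[str]:
    # Return the top-level keys that a version 1 bundle is missing.
    return [key for key in REQUIRED_V1 if key not in bundle]


# New versions get a new entry; old entries stay for as long as
# bundles of that version may still be in circulation.
CHECKERS = {1: check_v1}


def checker_for(bundle: dict):
    version = bundle.get("format_version")
    try:
        return CHECKERS[version]
    except KeyError:
        raise ValueError(f"unsupported format_version: {version!r}") from None


sample = {"format_version": 1, "author": {}, "snapshots": []}
missing = checker_for(sample)(sample)
print(missing)
```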
Stability caveats
- Ed25519 key rotation. When we rotate the signing key (V1), bundles signed by the old key must still verify. Plan is to publish a key-history document keyed by fingerprint; verify.py consults it offline. Not yet implemented.
- OTS receipt format evolution. opentimestamps-client is stable and backwards-compatible; we pin a lower bound and test against the latest.
- Bitcoin. Bitcoin block headers are what OTS ultimately anchors to. The dependency surface here is "Bitcoin continues to exist and the public calendar network continues to operate". Both are outside our control and are what gives the bundle its durability.
See also
- Verifying a bundle: what verify.py does step-by-step.
- Backend internals: how the bundle's inputs are assembled.
- Design principles: why independent verifiability is non-negotiable.