Backend internals

The backend is a Django 6 project (backend/blindproof_backend/) with the bulk of the logic in a single app (backend/api/). This page documents the models, storage abstraction, Merkle aggregation, OpenTimestamps integration, and proof-bundle assembly. See Backend API for the endpoint surface.

Data model

All in backend/api/models.py.

User

Custom Django user with email as USERNAME_FIELD. Mixes in PermissionsMixin so the Django admin's group/permission machinery works.

| Field | Type | Notes |
| --- | --- | --- |
| email | EmailField(unique) | Username. |
| argon2_salt | BinaryField | 16 bytes, generated at enrolment. Sent to the client so it can derive the master key. |
| created_at | DateTimeField | |
| attestation_signed_at | DateTimeField, null | Set on first POST /proof-bundle. |
| attestation_hash | BinaryField(64), null | SHA-512 of the canonical attestation text; lets us reject in-place edits later. |
| is_active | BooleanField | |
| is_staff | BooleanField | Admin-site access. Defaults False for author accounts. |
| is_superuser, groups, user_permissions | | Standard PermissionsMixin fields, used only by the admin. |

AuthToken

Opaque bearer tokens (64-char unique hex string). Tokens are long-lived; there's no automatic expiry in the POC. A user may have many — one per device, broadly.
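
A token of this shape can be minted with the standard library alone. The following is a minimal sketch, assuming a token is simply 32 random bytes rendered as hex; `new_token` is a hypothetical name and not necessarily how the project generates its tokens.

```python
import secrets


def new_token() -> str:
    """Return an opaque 64-character lowercase hex string (32 random bytes)."""
    # secrets draws from the OS CSPRNG, which is what a bearer token needs.
    return secrets.token_hex(32)


token = new_token()
```

Because the token carries no structure, revoking a device would amount to deleting its row.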

Snapshot

The core event — one row per successfully uploaded save.

| Field | Type | Notes |
| --- | --- | --- |
| user | FK(User) | |
| captured_at | DateTimeField | Client's wall clock at the save. |
| file_type | CharField(16) | E.g. "md", "txt". |
| path_ciphertext | BinaryField | AES-GCM ciphertext of the file path. |
| path_nonce | BinaryField | 12 bytes, fresh per upload (not per capture). |
| plaintext_hmac | BinaryField | The Merkle leaf. For commitment_scheme = "v2-per-leaf": HMAC-SHA256(HKDF(mac_key, ciphertext_ref), plaintext). For "v1-mac-key" (legacy): HMAC-SHA256(mac_key, plaintext). |
| ciphertext_ref | CharField(64, unique) | UUID-v4 hex; doubles as the blob-storage key. |
| commitment_scheme | CharField(32) | "v1-mac-key" (legacy, default for old rows) or "v2-per-leaf". Only v2 rows can carry a reveal entry in the proof bundle (see Proof bundle format §reveals). |
| extractor_version | CharField(32, null=True) | Which extractor produced the committed plaintext: "text-v1" (.md/.txt) or "docx-v1" (Word). Non-sensitive provenance, like file_type. NULL for legacy rows and clients that predate format tracking. |
| ciphertext_size | IntegerField | After encryption. |
| ciphertext_nonce | BinaryField | 12 bytes, per capture. |
| word_count, char_count | IntegerField | Computed client-side, sent in plaintext (metadata). |
| source_timestamp | DateTimeField, null | For connectors that surface an external timestamp (e.g. Google Docs revisions). |
| uploaded_at | DateTimeField(auto_now_add) | Server stamp. |
| merkle_root | FK(MerkleRoot, null) | Populated by aggregate_day. |
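
The two plaintext_hmac formulas can be illustrated with the standard library. This is a sketch under stated assumptions rather than the client's actual code: the page does not say how ciphertext_ref enters the HKDF, so here it is used as the HKDF `info` input with an all-zero salt, and the helper names `hkdf_sha256`, `leaf_v1` and `leaf_v2` are hypothetical.

```python
import hashlib
import hmac


def hkdf_sha256(key: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) over SHA-256. The all-zero salt is an assumption."""
    # Extract: PRK = HMAC(salt, input key material).
    prk = hmac.new(b"\x00" * 32, key, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated until long enough.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def leaf_v1(mac_key: bytes, plaintext: bytes) -> bytes:
    """Legacy "v1-mac-key" leaf: HMAC-SHA256(mac_key, plaintext)."""
    return hmac.new(mac_key, plaintext, hashlib.sha256).digest()


def leaf_v2(mac_key: bytes, ciphertext_ref: str, plaintext: bytes) -> bytes:
    """ "v2-per-leaf" leaf: HMAC-SHA256 under a key derived per ciphertext_ref."""
    # Assumption: ciphertext_ref is fed to HKDF as the info parameter.
    leaf_key = hkdf_sha256(mac_key, ciphertext_ref.encode("ascii"))
    return hmac.new(leaf_key, plaintext, hashlib.sha256).digest()
```

The point of the per-leaf key is that it can be disclosed for a single snapshot without exposing mac_key, which is presumably why only v2 rows are eligible for reveals.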

MerkleRoot

| Field | Type | Notes |
| --- | --- | --- |
| user | FK(User) | Each author has their own tree; daily roots don't mix users. |
| root_hash | BinaryField | 32 bytes. The public anchor. |
| computed_at | DateTimeField | End of the UTC day the root covers. |

OTSProof

| Field | Type | Notes |
| --- | --- | --- |
| merkle_root | OneToOneField(MerkleRoot) | |
| receipt_bytes | BinaryField | Serialized opentimestamps DetachedTimestampFile. |
| submitted_at | DateTimeField | When submit_ots_receipts last touched it. |
| bitcoin_block_height | IntegerField, null | Set by upgrade_ots_receipts once a calendar returns a Bitcoin attestation. null means "still pending". |

Blob storage

A BlobStorage Protocol (backend/api/storage.py) lets storage backends be swapped behind the API. The POC ships LocalBlobStorage(root), which writes each blob to <root>/<ciphertext_ref>.bin; an S3/B2 implementation should be straightforward to add and is deferred until volume storage becomes a bottleneck. The root is configured via the BLOB_STORAGE_ROOT setting.
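
The shape of that abstraction can be sketched as follows. Only the names BlobStorage and LocalBlobStorage and the <root>/<ciphertext_ref>.bin layout come from this page; the `put`/`get` method names, their signatures, and the hex check on the reference are assumptions made for illustration.

```python
import string
import tempfile
from pathlib import Path
from typing import Protocol


class BlobStorage(Protocol):
    """Structural interface; the method names here are hypothetical."""

    def put(self, ciphertext_ref: str, data: bytes) -> None: ...

    def get(self, ciphertext_ref: str) -> bytes: ...


class LocalBlobStorage:
    """Stores each blob as <root>/<ciphertext_ref>.bin on the local filesystem."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, ciphertext_ref: str) -> Path:
        # References are hex strings; rejecting anything else keeps paths inside root.
        if not ciphertext_ref or any(c not in string.hexdigits for c in ciphertext_ref):
            raise ValueError("ciphertext_ref must be a non-empty hex string")
        return self.root / f"{ciphertext_ref}.bin"

    def put(self, ciphertext_ref: str, data: bytes) -> None:
        self._path(ciphertext_ref).write_bytes(data)

    def get(self, ciphertext_ref: str) -> bytes:
        return self._path(ciphertext_ref).read_bytes()


# Usage: round-trip one blob through a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    store: BlobStorage = LocalBlobStorage(tmp)
    store.put("ab" * 16, b"ciphertext bytes")
    round_tripped = store.get("ab" * 16)
```

Because Protocol typing is structural, an object-store implementation would only need to provide the same two methods; it would not have to inherit from anything.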

Merkle aggregation

backend/api/merkle.py:

  • build_merkle(leaves: list[bytes]) -> bytes: a pure SHA-256 binary tree. Pairs are hashed as sha256(left + right). When a level has an odd number of nodes, the last node is duplicated (Bitcoin-style). A single-leaf tree returns the leaf unchanged.
  • aggregate_day(user, day): collects all of the user's snapshots captured in the given UTC day, builds the Merkle root, writes a MerkleRoot row, and links each Snapshot.merkle_root, all in one transaction.
  • aggregate_day also exists as a Django management command (manage.py aggregate_day --yesterday), which is what the daily workflow calls.
  • LEAF_ORDER_FIELDS / leaf_sort_key: the canonical leaf order, (captured_at, id). Both aggregate_day and assemble_bundle_data import this so the order is established in one place. See Proof bundle format §"Canonical leaf order".
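
The tree construction described in the list above can be sketched in a few lines. This follows the stated rules (sha256(left + right), duplicate the last node on an odd-sized level, single leaf returned unchanged) but is not the project's code: raising on an empty list is an assumption, and `leaf_sort_key` is shown over plain dicts, whereas the real helper presumably works on Snapshot rows.

```python
import hashlib


def build_merkle(leaves: list[bytes]) -> bytes:
    """Fold a list of leaves into a single SHA-256 Merkle root."""
    if not leaves:
        # Assumption: the page does not say what an empty day produces.
        raise ValueError("cannot build a Merkle tree from zero leaves")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node, Bitcoin-style
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def leaf_sort_key(snapshot: dict) -> tuple:
    """Canonical leaf order: captured_at first, id as the tie-breaker."""
    return (snapshot["captured_at"], snapshot["id"])


# Usage: order the snapshots canonically, then hash their leaves.
snapshots = [
    {"id": 2, "captured_at": "2024-01-01T10:00:00Z", "plaintext_hmac": b"\x02" * 32},
    {"id": 1, "captured_at": "2024-01-01T10:00:00Z", "plaintext_hmac": b"\x01" * 32},
]
ordered = sorted(snapshots, key=leaf_sort_key)
root = build_merkle([s["plaintext_hmac"] for s in ordered])
```

A verifier has to sort the leaves with the same key before rebuilding the tree, since a different order yields a different root.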

The OTS lifecycle

flowchart TD
    A["Yesterday's HMAC commitments<br/>across all snapshots"] --> B[build_merkle]
    B --> C[Merkle root stored]
    C --> D["submit_ots_receipts<br/>→ 3 public calendars"]
    D --> E["Pending receipt<br/>OTSProof.receipt_bytes"]
    E -->|2-6 hours later| F["Bitcoin confirms<br/>calendar timestamp"]
    F --> G["upgrade_ots_receipts<br/>merges Bitcoin attestation"]
    G --> H["OTSProof.bitcoin_block_height set<br/>verify.py now returns PASS"]

All of this lives in backend/api/ots.py:

  • OTSSubmitter — a Protocol defining submit(digest) -> bytes and upgrade(receipt_bytes) -> (bytes, int | None).
  • FakeOTSSubmitter — deterministic, for tests and local development. Produces a receipt with a recognisable FAKE-OTS-RECEIPT-FOR-… prefix that verify.py detects and explicitly refuses to claim as anchored.
  • OpenTimestampsSubmitter — the real submitter. Nonces the digest, fans out to the three default public calendars (alice.btc, bob.btc, finney) via opentimestamps.calendar.RemoteCalendar, tolerates partial success, returns a serialised DetachedTimestampFile.
  • get_ots_submitter() — factory that reads settings.OTS_SUBMITTER (sourced from the BLIND_OTS_MODE env var; defaults to fake so no mis-configured environment can accidentally spam the public calendars).
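
The interface and the fail-safe default described above might look roughly like this. It is a sketch, not the contents of ots.py: `resolve_ots_mode` and `is_fake_receipt` are hypothetical helpers, rejecting unknown modes is an assumption, and the fake receipt is assumed to begin with the ASCII bytes of the prefix quoted above.

```python
import os
from typing import Mapping, Optional, Protocol, Tuple

# Assumption: fake receipts start with these ASCII bytes.
FAKE_RECEIPT_PREFIX = b"FAKE-OTS-RECEIPT-FOR-"


class OTSSubmitter(Protocol):
    """Anything that can stamp a digest and later upgrade the receipt."""

    def submit(self, digest: bytes) -> bytes: ...

    def upgrade(self, receipt_bytes: bytes) -> Tuple[bytes, Optional[int]]: ...


def resolve_ots_mode(environ: Mapping[str, str] = os.environ) -> str:
    """Read BLIND_OTS_MODE, defaulting to "fake" when it is unset or empty."""
    mode = (environ.get("BLIND_OTS_MODE") or "fake").strip().lower()
    if mode not in ("fake", "real"):
        # Assumption: an unknown value is an error rather than a silent fallback.
        raise ValueError(f"unsupported BLIND_OTS_MODE: {mode!r}")
    return mode


def is_fake_receipt(receipt_bytes: bytes) -> bool:
    """True if the receipt carries the fake-submitter marker."""
    return receipt_bytes.startswith(FAKE_RECEIPT_PREFIX)
```

Defaulting to the fake submitter means the real calendars are only contacted when an operator opts in explicitly.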

Management commands:

  • submit_ots_receipts — walks MerkleRoots without an OTSProof and hands each to the configured submitter.
  • upgrade_ots_receipts — walks OTSProofs without a bitcoin_block_height, hits each pending attestation's calendar via get_timestamp(commitment), and if the calendar now has a Bitcoin attestation, merges it and stamps the block height.

The daily GitHub Actions workflow runs upgrade_ots_receipts → aggregate_day (yesterday) → submit_ots_receipts in that order. See Deployment & CI.

Proof bundle assembly

backend/api/bundle_builder.py is the orchestrator. It pulls together:

  • assemble_bundle_data(user, *, reveals=None) — gathers the user's snapshots, merkle roots, OTS receipts, and writing stats into a single data structure. The optional reveals argument is a {ciphertext_ref: reveal_key_hex} map the client posted along with the bundle request; eligible v2 leaves get a LeafReveal entry attached to their root.
  • canonical_bundle_json(data) in backend/api/bundle.py — produces a deterministic, sorted-keys UTF-8 JSON payload. Content-addressed via bundle_identifier_hex (SHA-256 of the canonical bytes).
  • sign_canonical_bytes(bytes, key) in backend/api/signing.py — Ed25519 signature via cryptography. Produces bundle.sig.json.
  • render_bundle_pdf(data) in backend/api/pdf.py — reportlab, native Drawing primitives for the timeline chart. No PNG intermediate, no extra system libraries required.
  • verify_template.py — the stdlib-only verifier, shipped inside the zip as verify.py.

build_bundle_zip(user) packs the four resulting files into a single zip ready to hand to the author.
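
The canonicalisation and content-addressing steps can be illustrated with the standard library. This is a sketch only: the page specifies sorted keys and UTF-8, while the compact separators and ensure_ascii=False used here are assumptions, as is the exact signature of bundle_identifier_hex.

```python
import hashlib
import json


def canonical_bundle_json(data: dict) -> bytes:
    """Serialise data so that equal content always yields identical bytes."""
    text = json.dumps(
        data,
        sort_keys=True,         # key order no longer depends on insertion order
        separators=(",", ":"),  # assumption: no optional whitespace
        ensure_ascii=False,     # assumption: non-ASCII is emitted as raw UTF-8
    )
    return text.encode("utf-8")


def bundle_identifier_hex(canonical: bytes) -> str:
    """SHA-256 of the canonical bytes, as lowercase hex."""
    return hashlib.sha256(canonical).hexdigest()


# Usage: the same content in a different key order maps to the same identifier.
first = canonical_bundle_json({"roots": [], "author": "a@example.org"})
second = canonical_bundle_json({"author": "a@example.org", "roots": []})
same_id = bundle_identifier_hex(first) == bundle_identifier_hex(second)
```

Signing the canonical bytes rather than a parsed structure means a verifier only has to hash the file it received; it never needs to re-serialise anything.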

The Ed25519 signing key lives as the BLINDPROOF_SIGNING_KEY Fly secret in production. Its public-key fingerprint is embedded in every bundle.sig.json so verify.py can confirm the bundle came from our signing key and was not altered after signing. Key rotation is planned for V1; until then, the fingerprint is expected to be stable.

See Proof bundle format for the exact on-wire format of each of the four files.

Django admin

/admin/ exposes the standard Django admin over all five models (backend/api/admin.py). It's an operator tool — the author- and publisher-facing surface is the dashboard, not this.

Create an operator account with manage.py createsuperuser. The custom UserManager.create_superuser fills in a random argon2_salt automatically, since an operator account never derives a master key.

Every content-bearing or key-material field is rendered read-only and truncated to a hex prefix: argon2_salt, attestation_hash, path_ciphertext, path_nonce, plaintext_hmac, ciphertext_nonce (and OTS receipt_bytes, shown only as a byte count). The admin never surfaces plaintext, paths, or raw key material, and never offers them as editable inputs, in keeping with the plaintext-never-leaves-the-client posture of the rest of the backend.
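
The truncation itself is simple to picture. The helpers below are hypothetical (`hex_prefix` and `byte_count` are not names from admin.py, and the 8-byte default is an assumption); they show one way a binary column could be reduced to a short hex prefix or a size for display.

```python
def hex_prefix(value, length: int = 8) -> str:
    """Render at most `length` leading bytes of a binary value as hex."""
    if value is None:
        return "-"
    raw = bytes(value)  # binary columns may arrive as bytes or memoryview
    shown = raw[:length].hex()
    return shown + "..." if len(raw) > length else shown


def byte_count(value) -> str:
    """Describe a binary value by its size only, never its content."""
    if value is None:
        return "-"
    return f"{len(bytes(value))} bytes"
```

In a Django ModelAdmin, helpers like these would typically back display methods listed in readonly_fields, so the raw column is never rendered as an editable input.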

Admin assets are served by WhiteNoise: STORAGES["staticfiles"] uses whitenoise.storage.CompressedStaticFilesStorage, and the Docker build runs collectstatic --noinput so the CSS/JS ship in the image.

Settings of note

All in backend/blindproof_backend/settings.py. Env-driven in production, defaulted for dev:

| Setting | Env var | Purpose |
| --- | --- | --- |
| SECRET_KEY | DJANGO_SECRET_KEY | Django. |
| DEBUG | DJANGO_DEBUG | |
| ALLOWED_HOSTS | DJANGO_ALLOWED_HOSTS | |
| DB path | DJANGO_DB_PATH | SQLite on the Fly volume in production. |
| BLOB_STORAGE_ROOT | BLIND_BLOB_ROOT | Local-FS blob store root. |
| OTS mode | BLIND_OTS_MODE | fake or real. |
| Signing key | BLINDPROOF_SIGNING_KEY | Ed25519 secret key (hex). |

Postgres migration is deferred — SQLite on a Fly volume is adequate for the POC traffic level.
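
The "env-driven in production, defaulted for dev" pattern can be sketched as a small loader. The environment variable names come from the table above; every default value, the `env_bool` and `load_settings` helpers, and the comma-separated format for hosts are assumptions made for illustration, not the contents of settings.py.

```python
import os
from typing import Mapping


def env_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in ("1", "true", "yes", "on")


def load_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Resolve settings from the environment, falling back to dev defaults."""
    hosts = environ.get("DJANGO_ALLOWED_HOSTS", "localhost,127.0.0.1")
    return {
        # All defaults below are placeholders for local development only.
        "SECRET_KEY": environ.get("DJANGO_SECRET_KEY", "dev-only-insecure-key"),
        "DEBUG": env_bool(environ.get("DJANGO_DEBUG", "true")),
        "ALLOWED_HOSTS": [h.strip() for h in hosts.split(",") if h.strip()],
        "DB_PATH": environ.get("DJANGO_DB_PATH", "db.sqlite3"),
        "BLOB_STORAGE_ROOT": environ.get("BLIND_BLOB_ROOT", "blobs"),
        "OTS_SUBMITTER": environ.get("BLIND_OTS_MODE", "fake"),
        # The signing key has no default here: a secret should not be invented.
        "SIGNING_KEY_HEX": environ.get("BLINDPROOF_SIGNING_KEY"),
    }
```

Passing the mapping in as a parameter keeps the loader testable without mutating os.environ.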