Reproducible Research with Verdifax

Bind a cryptographic seal to your computational result and to the declared environment fingerprint that produced it. Anyone (a reviewer, a regulator, a replicator) can independently recompute the seal and detect tampering, omission, or drift.

What this is for

A growing share of scientific, regulatory, and AI-governance work depends on a piece of code, run by a specific person, on a specific machine, producing a specific number. The standard answers to "can I trust this result?" (a Docker image, a pip freeze, an RStudio session-info appendix) are useful but unsealed. None of them tells a third party: "this number is sealed to this cryptographic hash, the producer declared this environment, and the transparency log recorded it at this time."

Verdifax fills that gap with a small, technology-neutral primitive: a manifest hash that binds your declared reproducibility context (runtime version, pinned dependencies, git SHA, declared random seeds, platform, and optional container image hash) into the audit bundle as Category 6.

When the orchestrator runs the same payload twice, the manifest hash is byte-identical. If the two hashes differ, the diff tells you which field moved. On Rekor-anchored deployments, every attestation is published to the Sigstore transparency log within seconds.
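To make the byte-identical claim concrete, here is a minimal sketch of how a declared context could be canonicalized, hashed, and diffed field by field. The canonical form (sorted-key JSON hashed with SHA-256) and the helper names `manifest_hash_sketch` and `differing_fields` are assumptions for illustration only; this page does not specify Verdifax's actual canonicalization or hash function, and the context values are made up.

```python
import hashlib
import json


def manifest_hash_sketch(context: dict) -> str:
    """Hash a declared context in a canonical form.

    Assumption: sorted-key compact JSON + SHA-256. The real scheme may differ.
    """
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def differing_fields(a: dict, b: dict) -> list:
    """Return the sorted names of top-level fields whose values differ."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


ctx_a = {
    "runtime_name": "python",
    "runtime_version": "3.12.1",
    "random_seeds": {"numpy": 42, "torch": 1337},
    "platform": "linux/amd64",
}
# Same declarations in a different key order: canonicalization hides the ordering.
ctx_b = {
    "platform": "linux/amd64",
    "random_seeds": {"torch": 1337, "numpy": 42},
    "runtime_version": "3.12.1",
    "runtime_name": "python",
}
# One field has drifted between runs.
ctx_c = dict(ctx_a, runtime_version="3.12.2")

same = manifest_hash_sketch(ctx_a) == manifest_hash_sketch(ctx_b)
drifted = differing_fields(ctx_a, ctx_c)
print(same, drifted)
```

The point of the canonical form is that only the declared values influence the hash, so a mismatch can always be traced back to a named field.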

Who this is for

Academic researchers publishing computational results that reviewers or replicators will need to verify months or years later.
Regulated-research teams (clinical trials, FDA submissions, EU AI Act conformity assessments) where the audit trail must cryptographically link a model's output to a declared environment.
Reproducibility-focused organizations (research labs, journals, preprint servers, registries) that want a third-party-verifiable proof artifact attached to each computational result.
Engineering teams in finance, healthcare, or governed AI who need to retire "trust me, it ran the same" and replace it with "here is the manifest hash; recompute it yourself."

The two-language stack

We ship official SDKs in the two languages that cover the overwhelming majority of computational research:

| | Python | R |
| --- | --- | --- |
| Install | `pip install verdifax` | `remotes::install_github("Verdifax/verdifax-sdk-r")` |
| Client | `VerdifaxClient()` | `verdifax_client()` |
| Capture environment | `capture_environment()` | `verdifax_capture_environment()` |
| Attest a result | `client.attest(...)` | `verdifax_attest(client, ...)` |
| Prove determinism | `verify_determinism(...)` | `verdifax_verify_determinism(client, ...)` |
| Reference workflow | `reproducible_research.ipynb` | `reproducible-research.Rmd` |

Both SDKs implement the same wire protocol against the same orchestrator, so a Python-authored attestation can be independently re-verified by an R-using auditor and vice versa. The manifest hash is the lingua franca.

What `capture_environment` records

| Field | Python source | R source |
| --- | --- | --- |
| `runtime_name` | hardcoded `"python"` | hardcoded `"R"` |
| `runtime_version` | `sys.version_info` | `R.version$major.minor` |
| `pinned_dependencies` | `importlib.metadata` (name + version, sorted) | `installed.packages()` (name + version, sorted) |
| `git_commit_sha` | `git rev-parse HEAD` (best-effort) | `git rev-parse HEAD` (best-effort) |
| `random_seeds` | caller-supplied dict, sorted | caller-supplied list, sorted |
| `platform` | `platform.system()` / `platform.machine()` → GOOS/GOARCH form | `Sys.info()` → GOOS/GOARCH form |
| `container_image_hash` | `/proc/self/cgroup` on Linux (best-effort) | `/proc/self/cgroup` on Linux (best-effort) |

Every field is optional. If auto-detection fails, the field is silently left null. The orchestrator records null as "not declared" rather than fabricating a claim; there is no path by which the orchestrator invents an environment fingerprint on the producer's behalf.
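The best-effort behavior can be sketched with the standard library alone. This is not the SDK's implementation: `capture_environment_sketch`, `best_effort_git_sha`, the `name==version` dependency format, and the GOOS/GOARCH lookup tables are assumptions for illustration, and container detection is left out entirely.

```python
import platform
import subprocess
import sys
from importlib import metadata

# Assumed mapping from Python's platform names to Go-style GOOS/GOARCH names.
_GOOS = {"Linux": "linux", "Darwin": "darwin", "Windows": "windows"}
_GOARCH = {"x86_64": "amd64", "AMD64": "amd64", "aarch64": "arm64", "arm64": "arm64"}


def best_effort_git_sha():
    """Return the current commit SHA, or None if git is missing or this is not a repo."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True, timeout=5,
        )
        return out.stdout.strip() or None
    except (OSError, subprocess.SubprocessError):
        return None


def capture_environment_sketch(declared_seeds=None):
    """Collect a declared environment; anything undetectable stays None."""
    v = sys.version_info
    deps = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with broken metadata
            deps.add((name, version))
    goos = _GOOS.get(platform.system())
    goarch = _GOARCH.get(platform.machine())
    return {
        "runtime_name": "python",
        "runtime_version": f"{v.major}.{v.minor}.{v.micro}",
        "pinned_dependencies": [f"{n}=={ver}" for n, ver in sorted(deps)],
        "git_commit_sha": best_effort_git_sha(),
        # Seeds come from the caller and are sorted by name; None if not declared.
        "random_seeds": dict(sorted((declared_seeds or {}).items())) or None,
        "platform": f"{goos}/{goarch}" if goos and goarch else None,
        "container_image_hash": None,  # not detected in this sketch
    }


ctx = capture_environment_sketch({"torch": 1337, "numpy": 42})
print(ctx["runtime_version"], ctx["platform"], ctx["git_commit_sha"])
```

Note that every failure path returns `None` rather than a guessed value, which mirrors the "not declared" semantics described above.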

A 30-second example (Python)

```python
import verdifax
from verdifax.research import capture_environment, verify_determinism

client = verdifax.VerdifaxClient()  # env: VERDIFAX_API_URL, VERDIFAX_API_KEY

# 1. Declare what you ran.
ctx = capture_environment(declared_seeds={"numpy": 42, "torch": 1337})

# 2. Attest a result with the environment bound in.
receipt = client.attest(
    payload="my-analysis-output",
    program_id="0" * 64,
    route_id="paper-figure-3",
    registry_record_hash="0" * 64,
    reproducibility_context=ctx,
)
print("ManifestHash:", receipt.manifest_hash)

# 3. Prove the pipeline is deterministic.
det = verify_determinism(
    client=client,
    payload="my-analysis-output",
    program_id="0" * 64,
    route_id="paper-figure-3-verify",
    registry_record_hash="0" * 64,
    reproducibility_context=ctx,
)
assert det.deterministic, det.diff.differing_fields
```

The R equivalent reads identically; see the R Markdown template for the full version.

What the determinism check answers

`verify_determinism` runs your payload through the pipeline twice and compares the resulting canonical manifest hashes. The top-level `deterministic` flag depends only on manifest-hash equality, because the manifest hash is the seal of the pipeline output.

Bundle-hash differences, when surfaced in `diff.differing_fields`, indicate server-observed timing variation (e.g. `LatencyTotalMs`, `EnvSnapshot.Time`) and are labeled informational. On their own, they never cause `deterministic` to flip to false.

In production end-to-end testing, both the Python and R SDKs have demonstrated `deterministic == true` with byte-identical manifest hashes across replays.
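The decision rule described in this section can be sketched as follows. The classes `ReceiptSketch` and `DeterminismResultSketch`, the function `compare_runs`, and the sample values are hypothetical stand-ins, not the SDK's types; the only thing taken from the text is the rule itself: manifest-hash equality decides the flag, and bundle-level timing differences are reported but never decide it.

```python
from dataclasses import dataclass, field


@dataclass
class ReceiptSketch:
    """Hypothetical stand-in for an attestation receipt."""
    manifest_hash: str
    bundle_fields: dict = field(default_factory=dict)


@dataclass
class DeterminismResultSketch:
    deterministic: bool
    informational_diffs: list


def compare_runs(first: ReceiptSketch, second: ReceiptSketch) -> DeterminismResultSketch:
    """Decide determinism from manifest-hash equality only."""
    keys = first.bundle_fields.keys() | second.bundle_fields.keys()
    diffs = sorted(
        k for k in keys if first.bundle_fields.get(k) != second.bundle_fields.get(k)
    )
    return DeterminismResultSketch(
        deterministic=first.manifest_hash == second.manifest_hash,
        informational_diffs=diffs,  # reported, but never flips the flag
    )


# Made-up values: identical manifest hashes, different server-side timing fields.
run_1 = ReceiptSketch("ab" * 32, {"LatencyTotalMs": 118, "EnvSnapshot.Time": "t1"})
run_2 = ReceiptSketch("ab" * 32, {"LatencyTotalMs": 131, "EnvSnapshot.Time": "t2"})
result = compare_runs(run_1, run_2)
print(result.deterministic, result.informational_diffs)
```

Keeping the timing fields out of the decision is what lets two replays count as deterministic even though no two server round trips take exactly the same time.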

What to do with the manifest hash

Cite it. Treat the manifest hash like a DOI for the computational result: short, unambiguous, machine-checkable.
Pin it in supplementary materials. Reviewers and replicators can recompute the hash if they have your inputs and declared environment.
Bind it to the artifact. Sign your dataset or paper PDF over the manifest hash so the link between paper and result is non-repudiable.
Anchor it. On Rekor-anchored deployments, every attestation publishes to the Sigstore transparency log. Any subsequent attempt to alter or omit the record is detectable.
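As one way to picture the "bind it to the artifact" step, the sketch below builds a small canonical statement that ties an artifact's SHA-256 digest to a manifest hash. The statement layout and the name `binding_statement` are assumptions for illustration, the PDF bytes are a placeholder, and the signing step itself is left to whatever signing tool you already use.

```python
import hashlib
import json


def binding_statement(artifact_bytes: bytes, manifest_hash: str) -> bytes:
    """Build the canonical bytes a signing tool would sign (assumed layout)."""
    statement = {
        "artifact_sha256": hashlib.sha256(artifact_bytes).hexdigest(),
        "manifest_hash": manifest_hash,
    }
    return json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Placeholder artifact and an all-zero manifest hash, as in the example above.
pdf_bytes = b"%PDF-1.7 placeholder paper contents"
statement = binding_statement(pdf_bytes, "0" * 64)

# Any change to the artifact changes the statement, so an old signature stops verifying.
tampered = binding_statement(pdf_bytes + b" ", "0" * 64)
print(statement != tampered)
```

Signing this statement, rather than the PDF alone, is what makes the link between the paper and the computational result checkable by a third party.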

Reference workflows

We provide runnable, end-to-end demos in both ecosystems:

Jupyter notebook (`verdifax-sdk-python/examples/reproducible_research.ipynb`): fits a logistic regression with a fixed seed, declares the environment, attests the result, verifies determinism, and explains what to do with the resulting manifest hash.
R Markdown template (`verdifax-sdk-r/examples/reproducible-research.Rmd`): fits a GLM with the same narrative arc. Renders to a single self-contained HTML file you can link as supplementary material.

What the Manifest Hash proves: the cryptographic primitive that backs every attestation.
Independent verification: how a third party recomputes the manifest hash without trusting Verdifax's servers.
EU AI Act, Article 13: why declared reproducibility context maps directly to transparency and record-keeping obligations.