falsify

Pre-register your ML accuracy claims

falsify lets developers pre-register machine-learning accuracy claims in a canonical YAML specification that includes an executable test plan, a metric, and a threshold. Locking the spec hashes it with SHA-256, so any later change to the threshold produces a different hash and is detectable. The engine then runs the declared experiment in the specified environment, compares the observed result against the locked threshold, and exits with a distinct code for each outcome: pass, fail, or tampered specification.
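The lock-and-compare idea can be illustrated with a minimal Python sketch. It is not falsify's actual implementation: the exit-code values, the field names, the JSON-based canonical serialization (standing in for canonical YAML), and the "higher is better" comparison are all assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical exit codes; the tool's real values are not stated here.
EXIT_PASS, EXIT_FAIL, EXIT_TAMPERED = 0, 1, 2


def canonical_hash(spec: dict) -> str:
    """SHA-256 over a canonical serialization (sorted keys, fixed separators).

    JSON stands in for canonical YAML so the sketch needs only the stdlib.
    """
    payload = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verdict(spec: dict, locked_hash: str, observed: float) -> int:
    """Return an exit code: tampered, pass, or fail."""
    # Any edit to the spec after locking changes its hash.
    if canonical_hash(spec) != locked_hash:
        return EXIT_TAMPERED
    # Assumes a higher-is-better metric such as accuracy.
    return EXIT_PASS if observed >= spec["threshold"] else EXIT_FAIL


# Declare and lock before looking at any results.
spec = {"metric": "accuracy", "threshold": 0.90, "test_plan": "python eval.py"}
locked = canonical_hash(spec)
```

Lowering the threshold after the fact, for example `dict(spec, threshold=0.80)`, no longer matches `locked`, so `verdict` reports the spec as tampered instead of quietly passing.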

It is intended for researchers and engineers who want reproducible, falsifiable performance reporting. It integrates with continuous-integration pipelines and can gate commits and documentation updates through a git commit-msg hook that blocks changes contradicting the recorded verdict. The workflow has four steps: declare, lock, run, and guard. The threshold is therefore set before the data is examined, and any later deviation is detectable.
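As a rough illustration of the guard step, the sketch below checks a commit message against a recorded result. The claim pattern, the tolerance, and the function names are hypothetical and are not taken from falsify. The only git behavior relied on is that a commit-msg hook receives the path of the message file as its first argument and aborts the commit when it exits non-zero.

```python
import re

# Hypothetical pattern for claims such as "accuracy: 0.93" or "accuracy to 95%".
CLAIM = re.compile(r"accuracy\D{0,10}?(\d+(?:\.\d+)?)\s*(%?)", re.IGNORECASE)


def claimed_values(message: str) -> list:
    """Extract accuracy claims from a message, normalized to fractions."""
    values = []
    for number, percent in CLAIM.findall(message):
        value = float(number)
        values.append(value / 100 if percent else value)
    return values


def contradicts(message: str, observed: float, tol: float = 0.005) -> bool:
    """True if the message claims more than the recorded result supports."""
    return any(claim > observed + tol for claim in claimed_values(message))


def check_message(message: str, observed: float) -> int:
    """Exit code for a commit-msg hook: non-zero blocks the commit."""
    return 1 if contradicts(message, observed) else 0
```

A hook script would read the message from the file path git passes in, load the recorded result from wherever the verdict is stored, and exit with the value returned by `check_message`.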

Three features set it apart: a cryptographic lock on the claim, rejection of vague specifications at lock time, and numeric rather than natural-language verdicts that can be used directly in CI gating. The package is installable with pip and ships with a test suite that validates its honesty and tamper-detection features.
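Rejecting vague specifications at lock time could look something like the following sketch. The three required fields come from the description above; the specific checks and error messages are assumptions, not the tool's actual validation rules.

```python
# Fields named in the description; the checks themselves are hypothetical.
REQUIRED = ("test_plan", "metric", "threshold")


def lock_errors(spec: dict) -> list:
    """Return the reasons a spec is too vague to lock (empty list if none)."""
    errors = [f"missing field: {key}" for key in REQUIRED if key not in spec]

    if "threshold" in spec:
        threshold = spec["threshold"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(threshold, bool) or not isinstance(threshold, (int, float)):
            errors.append("threshold must be a number")

    if "test_plan" in spec:
        plan = spec["test_plan"]
        if not (isinstance(plan, str) and plan.strip()):
            errors.append("test_plan must be a non-empty command")

    return errors
```

A spec with a threshold such as `"high"` or an empty test plan would be refused before any hash is computed, so a vague claim never becomes a locked one.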
