{"slug": "show-hn-avera-a-deterministic-check-that-proves-no-regression-was-introduced", "title": "Show HN: Avera – a deterministic check that proves no regression was introduced", "summary": "A developer released Avera, an open-source tool that deterministically detects introduced regressions by comparing baseline and current test runs, blocking releases only when a previously passing test fails. The tool provides a tamper-evident evidence trail and supports multiple safety-critical domains like automotive and medical, aiming to prevent regressions from slipping through green CI pipelines, especially with AI-generated code.", "body_md": "**A deterministic regression gate for code changes.** Green CI proves nothing *failed* — AVERA proves nothing *regressed*.\n\nAVERA compares a baseline test run against the current one and blocks a release only when there is proof of an introduced regression — a test that passed before and fails now — with a tamper-evident evidence trail behind the verdict. Local-first, deterministic, no LLM in the decision.\n\nInstall from source (AVERA is not yet on PyPI), then point it at two JUnit files — verdict + gate out, no project setup, no requirements file:\n\n```\ngit clone https://github.com/tc7kxsszs5-cloud/avera && cd avera\npip install -e .\n\navera check --baseline main.xml --current pr.xml\n#\n# AVERA Check\n# Verdict:  confirmed_regression\n# Introduced failures (1): pkg.tests.test_thing\n# Gate [general.v1]: block         (exit 1 — fails the CI step)\n```\n\nWorks with anything that emits **JUnit / xUnit XML** (pytest, jest, go test, JUnit, …). Add `--json` for machines; the exit code drops into any pipeline.\n\nAVERA ships a public **blind-replay benchmark** of real regressions. 
AVERA is given *only* the before/after test results — no hint where the bug is — and must catch it.\n\n```\nAVERA_PY=python3 ./benchmark/reproduce.sh\n# PASS  toolz-f0831e7  -> confirmed_regression / block\n```\n\nThat case is commit `f0831e7` in the real [pytoolz/toolz](https://github.com/pytoolz/toolz) library (later reverted in PR #551). Given only the two result sets, AVERA independently identified the introduced failure (`test_isiterable`, pass→fail), ruled `confirmed_regression`, and returned `gate=block` under every domain policy. See [`benchmark/`](/tc7kxsszs5-cloud/avera/blob/main/benchmark) — and add your own case.\n\nA passing CI run only proves *no expressed test failed* — not that nothing regressed. When prod breaks after a green merge, there is no machine-checkable record of **what** regressed or **why the merge was allowed**; teams reconstruct it by hand after an incident.\n\nWith AI agents now generating PRs faster than anyone can review them, \"the suite was green\" and \"that test is just flaky\" are exactly how genuine pass→fail regressions slip through. AVERA gives the reviewer a deterministic separator — **proven introduced regression vs everything else** — and a tamper-evident trail behind every gate decision.\n\nStated plainly, because overclaiming is the failure mode this project avoids:\n\n- It does **not** catch a regression that **no test exercises** — that needs fault-injection / mutation analysis, not the gate.\n- It does **not** decide **flaky vs real** — that stays a human call.\n- It does **not** decide your release — it produces auditable evidence; a human signs off. No LLM in the decision path.\n- It is **not** a certified/qualified tool. Its output is designed to be **independently re-checkable** by a human (inspectable manifest, hash-chained audit, re-derivable integrity root).\n\nThe same deterministic engine, calibrated per domain via policy-as-data. 
Verdict assignment is a [proven-total decision table](/tc7kxsszs5-cloud/avera/blob/main/docs/AVERA_VERDICT_SPECIFICATION.md).\n\n| Domain | Standard | Status |\n|---|---|---|\n| Software / CI / DevOps | plain pass/fail CI, AI-PR triage | ✅ |\n| Automotive (ADAS, BMS) | ISO 26262 | ✅ |\n| Aviation (avionics) | DO-178C | ✅ |\n| Railway (signaling, control) | CENELEC EN 50128 | ✅ |\n| Medical devices | IEC 62304 / ISO 14971 | ✅ |\n| Space / flight software | NASA NPR 7150.2 / NASA-STD-8739.8 | ✅ |\n\nPick a policy with `--policy <name>` (`general`, `automotive`, `aviation`, `railway`, `medical`, `space`, `ai_agent`).\n\n- **Zero-config check** — `avera check` (two JUnit files → verdict + gate), for plain pass/fail CI.\n- **Regression triage** — baseline vs current comparison; fail-closed classification (unknown status → treated as failure, never hidden).\n- **Deterministic gate** — policy-as-data per domain; same inputs → same verdict → same evidence root, on any machine.\n- **Evidence manifest** — content-addressed `integrity_root` binding the whole artifact set.\n- **Immutable audit log** — SHA-256 hash-chained, with an optional keyed (HMAC) tamper-evident mode.\n- **Sign-off** — bound to the manifest root; fails closed if verification is skipped.\n- **Requirement coverage proof** — traceable from change → test → requirement (regulated domains).\n- **REST API & GitHub Action** — for CI/CD integration (see below).\n\n```\nsrc/avera/\n├── adapters/   — artifact format adapters (JUnit, CSV, simulation, logs, CANoe)\n├── compare/    — baseline vs current comparison (fail-closed status taxonomy)\n├── classify/   — regression classification + proven-total verdict spec\n├── gates/      — deterministic gate, policy-as-data (policies/*.json)\n├── evidence/   — content-addressed evidence manifest (integrity_root)\n├── audit/      — hash-chained SHA-256 audit log (optional keyed HMAC)\n├── signoff/    — sign-off state machine bound to the manifest root\n├── domains/    — 
per-domain profiles (avionics, powertrain, space, …)\n├── mutation/   — fault-injection / mutation-based confidence lens\n└── api/        — FastAPI REST endpoint\n\nbenchmark/      — public blind-replay regression benchmark (reproduce.sh)\nfixtures/       — reference scenarios across domains\ndocs/           — verdict spec, hardening report, dev principles, GTM\ntests/          — unit + cross-domain fixtures + exhaustive verdict-spec proof\n```\n\n```\ngit clone https://github.com/tc7kxsszs5-cloud/avera\ncd avera\npip install -e \".[demo]\"\n\n# Run the live demo shell\n./start_demo.sh                      # → http://localhost:8501\n\n# Or analyze a full evidence pack\navera analyze --project fixtures/bms-fast-charge --out reports\n```\n\nOr try the **hosted demo preview** — no install:\n👉 [https://avera-production.up.railway.app](https://avera-production.up.railway.app)\n(Read-only preview of the Streamlit shell — not full self-service.)\n\nAVERA ships as a reusable GitHub Action, in two modes.\n\n**Zero-config** — gate plain pass/fail CI with two JUnit files, no evidence pack:\n\n```\n# .github/workflows/avera-verify.yml\nname: AVERA\non: [pull_request]\n\njobs:\n  verify:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: tc7kxsszs5-cloud/avera@v1\n        with:\n          baseline: main-junit.xml   # known-good results (e.g. 
from main)\n          current: pr-junit.xml      # this PR's results\n          policy: general            # or space / automotive / aviation / …\n      # The job fails when the gate blocks (a confirmed regression).\n```\n\n**Full evidence pack** — the canonical artifact set for regulated review:\n\n```\n      - uses: actions/checkout@v4\n      - uses: tc7kxsszs5-cloud/avera@v1\n        with:\n          project_path: evidence/my-change\n          fail_on_release_blocking: 'true'\n```\n\n**Inputs:** `project_path` (required), `output_path`, `policy`, `fail_on_release_blocking`, `fail_on_regression`, `expected_verdict`.\n**Outputs:** `verdict`, `risk`, `confidence`, `gate_status`, `report_path`, `manifest_path`, `integrity_root`, `audit_log_path`.\n\nExamples: [examples/github-action-usage.yml](/tc7kxsszs5-cloud/avera/blob/main/examples/github-action-usage.yml), [`examples/github-action-minimal.yml`](/tc7kxsszs5-cloud/avera/blob/main/examples/github-action-minimal.yml).\n\nServed with `uvicorn avera_api.main:app`.\n\n```\nuvicorn avera_api.main:app --host 0.0.0.0 --port 8000\n\n# Full canonical artifact set + deterministic gate status + integrity_root\ncurl -X POST http://localhost:8000/evidence-pack \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"project\": \"fixtures/bms-fast-charge\", \"policy\": \"automotive\"}'\n```\n\n`/evidence-pack` returns `verdict`, `risk`, `confidence`, the deterministic `gate_status`, the evidence-manifest `integrity_root`, a decision summary, and the on-disk paths of every canonical artifact.\n\n```\ndocker pull ghcr.io/tc7kxsszs5-cloud/avera-cli:latest\ndocker run --rm \\\n  -v \"$PWD/fixtures/bms-fast-charge:/workspace\" \\\n  -v \"$PWD/reports:/reports\" \\\n  ghcr.io/tc7kxsszs5-cloud/avera-cli:latest \\\n  analyze --project /workspace --out /reports --memory /reports/avera-memory.jsonl\n```\n\nMulti-arch (`linux/amd64`, 
`linux/arm64`). Pinned tags: `latest`, `vX.Y.Z`, `sha-<short>`.\n\nLooking for engineering teams — running ordinary CI, or in automotive, aviation, railway, medical, or space — who want a narrow pilot with their own artifacts.\n\n**The pilot is simple:** one software change · one artifact family you already export · one 2-week review session. No infrastructure changes, no process disruption.\n\n📩 Contact: [mgaloyan79@gmail.com](mailto:mgaloyan79@gmail.com) · 🔗 Demo: [avera-production.up.railway.app](https://avera-production.up.railway.app)\n\nApache 2.0 — see [LICENSE](/tc7kxsszs5-cloud/avera/blob/main/LICENSE)\n\n*AVERA Engineering — engineering truth, preserved as evidence.*", "url": "https://wpnews.pro/news/show-hn-avera-a-deterministic-check-that-proves-no-regression-was-introduced", "canonical_source": "https://github.com/tc7kxsszs5-cloud/avera", "published_at": "2026-06-19 08:16:32+00:00", "updated_at": "2026-06-19 08:31:43.880623+00:00", "lang": "en", "topics": ["developer-tools", "ai-safety", "ai-agents"], "entities": ["Avera", "pytoolz/toolz", "PyPI", "JUnit", "xUnit", "pytest", "jest", "go test"], "alternates": {"html": "https://wpnews.pro/news/show-hn-avera-a-deterministic-check-that-proves-no-regression-was-introduced", "markdown": "https://wpnews.pro/news/show-hn-avera-a-deterministic-check-that-proves-no-regression-was-introduced.md", "text": "https://wpnews.pro/news/show-hn-avera-a-deterministic-check-that-proves-no-regression-was-introduced.txt", "jsonld": "https://wpnews.pro/news/show-hn-avera-a-deterministic-check-that-proves-no-regression-was-introduced.jsonld"}}