A deterministic, offline compiler that turns a natural-language description of a bioinformatics analysis into a self-contained, agent-executable RO-Crate package — with a full-lifecycle conversational shell wrapped around the executing package.
The compiler classifies the intake, selects an archetype, builds a task DAG, and emits a package; an execution harness then drives an agent (Claude Code, a shell script, anything callable with a package path) against the emitted DAG. The emitted package is an ECAA (Evidence-Carrying Analysis Artifact) v0.2 package: a typed RO-Crate that carries, alongside the analysis itself, the claims it supports, the evidence backing each claim, and the decision record that produced them. An embedded ECAA validator checks the package against a machine-checkable contract over those subgraphs. Emit-time validation defaults to pure-Rust JSON Schema (`schema_only`) and is advisory (warn-only) on a plain emit unless `ECAA_VALIDATION_BLOCK_ON_FAIL=1` is set; the local conformance gate (`make conformance`, `ECAA_CONFORMANCE_MODE=1`) runs it block-on-fail. The contract establishes machine-checkable consistency between an analysis's claims, evidence, decisions, and execution provenance. It does not establish biological validity; treat verdicts as a hygiene floor, not a quality ceiling.
| Component | Crate / dir | Role |
|---|---|---|
| Compiler | `crates/{core, cli}` | Classifier → DAG → emitter. Synchronous, no LLM dependency. |
| Conversation shim | `crates/conversation` | Closed tool vocabulary wraps the compiler. LLM is a UX shim only. |
| Chat server | `crates/server` | Axum HTTP + SSE backend at `/api/chat/*` and `/api/git/*`. |
| Execution harness | `crates/harness` | Loops an agent subprocess against ready tasks. Local / Mock / AWS / SLURM executors. |
| ECAA validator | `crates/{ecaa-conformance, ecaa-types}` + `docs/ecaa-spec/` | Emits + validates the ECAA conformance contract. `ecaa-conformance` re-exports `core`'s public API and is the conformance harness a second implementer runs against their own packages; it is not an independent reimplementation of `core`. |
| Web UI | `ui/` | React 18 + Vite + TypeScript chat surface. |
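To make "loops an agent subprocess against ready tasks" concrete, here is a rough sketch of how ready tasks could be selected from a task DAG. It is an illustration under assumptions: `ready_tasks` and the dependency-map representation are hypothetical, and the harness's actual scheduling logic is not described in this README.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Return the tasks that are not yet done and whose dependencies are all done.
/// `deps` maps each task name to the set of task names it depends on.
/// Iterating a BTreeMap yields keys in sorted order, so the result is stable.
pub fn ready_tasks(
    deps: &BTreeMap<String, BTreeSet<String>>,
    done: &BTreeSet<String>,
) -> Vec<String> {
    deps.iter()
        .filter(|(task, needs)| !done.contains(*task) && needs.is_subset(done))
        .map(|(task, _)| task.clone())
        .collect()
}
```

A harness loop in this style would call `ready_tasks`, hand each returned task to an executor, add finished tasks to `done`, and repeat until nothing is ready.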
Current ECAA profile IRI: `https://w3id.org/ecaa/v0.2`. The active spec files are in `docs/ecaa-spec/`: `v0.2.md`, `ecaa-v0.2.ttl`, `ecaa-v0.2.shacl.ttl`, and `ecaa-v0.2.jsonld`.

Linux x86-64 is the primary supported target. macOS works for dev. Windows requires WSL2.
sudo apt-get install -y build-essential pkg-config libssl-dev mold git curl # Debian/Ubuntu
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
cargo install --locked cargo-nextest cargo-hakari
make build # Rust workspace (debug)
make install # binaries to ~/.cargo/bin
(cd ui && npm install)
`make doctor` prints toolchain readiness; `make help` lists targets.
Two terminals:
make dev-server
make dev-ui
Execution requires the installed harness binary. `make dev-server` / `make dev-ui` run from source (`cargo run` / Vite), so they work without `make install`. But the Start execution button spawns the `ecaa-workflow-harness` binary by name from `PATH` (override with `ECAA_HARNESS_BIN_PATH`). If you skipped `make install` (Setup step 5), execution fails with `failed to spawn harness: No such file or directory`; run `make install` to fix it.
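The lookup rule above (bare binary name resolved through `PATH`, unless `ECAA_HARNESS_BIN_PATH` overrides it) can be sketched as follows. `harness_program` and `DEFAULT_HARNESS_BIN` are hypothetical names, and ignoring an empty override is an assumption rather than documented behaviour.

```rust
use std::path::PathBuf;

/// Default binary name; the OS resolves a bare name through PATH at spawn time.
pub const DEFAULT_HARNESS_BIN: &str = "ecaa-workflow-harness";

/// Pick the program to spawn: the override if present and non-blank,
/// otherwise the bare default name.
pub fn harness_program(override_path: Option<&str>) -> PathBuf {
    match override_path {
        Some(p) if !p.trim().is_empty() => PathBuf::from(p),
        _ => PathBuf::from(DEFAULT_HARNESS_BIN),
    }
}
```

A caller might pass `std::env::var("ECAA_HARNESS_BIN_PATH").ok().as_deref()` and hand the result to `std::process::Command::new`; if the bare name is not on `PATH`, the spawn fails with the "No such file or directory" error quoted above.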
The chat surface boots in offline mode without an API key (the UI renders but assistant turns are mocked). For LLM-mediated chat:
export ECAA_ANTHROPIC_API_KEY=<your key>
make dev-server
For local production configuration, start from .env.example. Its active defaults are loopback/local and durable under `$HOME/.ecaa-workflow`; live API, eval, debug, AWS, SLURM, and external-validator knobs are commented until deliberately enabled.

Smoke-test the compiler against a bundled scenario:
ecaa-workflow intake \
--input testdata/scenarios/01-bulk-rnaseq-ibd/request.md \
--output /tmp/ibd-package
ecaa-workflow dag --package /tmp/ibd-package
make test # cargo test --workspace
make test-ui # Vitest + axe a11y
make check # test + tsc --noEmit
make e2e-playwright # mocked Playwright tier
- **Compiler is synchronous.** `tokio` is allowed in `server`, `conversation`, and `cli` (for `serve` only). Never in `core` or `harness`. Harness uses `ureq` (sync).
- **Deterministic output.** Emitted packages are byte-reproducible. Use `BTreeMap`, not `HashMap`. Avoid timestamps and random IDs outside `uuid_short()`.
- **LLM as UX shim.** Closed tool vocabulary (`Tool::COUNT` asserted at compile time). High-impact actions are gated by deterministic server state, not LLM inference.
- **Confirmation discipline.** `emit_package` returns `PreconditionFailure` unless `session.user_confirmed == true`. The button click is a server-side action the LLM observes only via `get_session_state`.
- **ECAA conformance.** Every normal emitted package carries the eight required ECAA v0.2 sidecars: `runtime/intake-conversation.jsonl`, `runtime/decisions.jsonl`, `runtime/validation-reports.jsonl`, `runtime/proofs.jsonl`, `runtime/claim-verification.json`, `runtime/verifier-decisions.jsonl`, `runtime/assumptions.jsonl`, and `runtime/audit-proof-report.json`. There is no reduced ECAA mode switch; `ECAA_ABLATE_*` flags are the only code-backed way to emit non-conformant control artifacts.
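Three of these conventions lend themselves to a short illustration: the compile-time check on the tool vocabulary, the confirmation gate on `emit_package`, and ordered maps for reproducible output. The sketch below is hypothetical throughout. The variant list, `Session`, `ToolError`, the return values, and `render_manifest` are stand-ins chosen for the example, not the project's real types or signatures.

```rust
use std::collections::BTreeMap;

/// Hypothetical closed tool vocabulary; the real variant list is not shown here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tool {
    GetSessionState,
    EmitPackage,
}

impl Tool {
    pub const ALL: [Tool; 2] = [Tool::GetSessionState, Tool::EmitPackage];
    pub const COUNT: usize = Tool::ALL.len();
}

// Evaluated at compile time: growing `ALL` without updating the expected
// size here fails the build instead of silently widening the vocabulary.
const _: () = assert!(Tool::COUNT == 2);

/// Hypothetical server-side session state; only the server flips this flag.
#[derive(Debug, Default)]
pub struct Session {
    pub user_confirmed: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ToolError {
    PreconditionFailure(&'static str),
}

/// Refuse to emit until the user has confirmed. The Ok value is a placeholder.
pub fn emit_package(session: &Session) -> Result<&'static str, ToolError> {
    if !session.user_confirmed {
        return Err(ToolError::PreconditionFailure("user has not confirmed"));
    }
    Ok("emitted")
}

/// Render key/value fields in sorted key order. A BTreeMap iterates by key,
/// so the same fields always produce the same bytes regardless of insertion order.
pub fn render_manifest(fields: &BTreeMap<String, String>) -> String {
    fields.iter().map(|(k, v)| format!("{k}={v}\n")).collect()
}
```

The point of the gate is that the model cannot talk its way past it: the flag lives in server state, so a tool call made before confirmation fails the same way every time.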
`config/` is the source of truth for modalities, archetypes, atoms, compute profiles, gene panels, plot affordances, and downstream-policy contracts. `config/archetypes/` and `config/stage-atoms/` carry their own READMEs. Runtime environment variables are catalogued in .env.example.
User guide: USERS.md. Methods summary: METHODS.md. Contributing: CONTRIBUTING.md. ECAA spec: `docs/ecaa-spec/`.
Apache-2.0 — see LICENSE.