# Building a self-hosted, AI-native workflow engine in Rust (180 node types, no SDK bloat)

> Source: <https://dev.to/_34d3c9ee969f97a6c811fb/building-a-self-hosted-ai-native-workflow-engine-in-rust-180-node-types-no-sdk-bloat-266b>
> Published: 2026-06-16 18:03:59+00:00

I've spent the last while building **Trigix**, an open-source (MIT), self-hostable workflow automation platform. Think n8n, but the execution engine is in Rust and the AI nodes can run entirely against local models. This post is about a few engineering decisions I think are worth sharing, not a feature tour.

A workflow is a DAG of typed nodes. Every node is one variant of a `NodeType` enum and one async function:

```rust
match node.node_type {
    NodeType::Http     => execute_http(node, ctx, &client).await,
    NodeType::Agent    => execute_agent(node, ctx, &client, ai_url).await,
    NodeType::Sqs      => execute_sqs(node, ctx, &client).await,
    // … ~180 arms
}
```

Adding a node type touches a fixed set of places (enum variant, executor impl, dispatch arm, a `node_type → str` map, and the frontend palette/config panel). It's mechanical, which is the point: the cost of a new integration is bounded and reviewable, and the compiler tells you if you forgot a touch point.
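To illustrate the "compiler tells you" part, here is a minimal sketch of a name map written as an exhaustive `match`. The three-variant enum, the `ALL` list, and the `as_str`/`parse` names are assumptions for illustration, not Trigix's actual code.

```rust
/// Hypothetical three-variant stand-in for the real ~180-variant enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeType {
    Http,
    Agent,
    Sqs,
}

impl NodeType {
    /// Hand-maintained list of variants; the compiler does NOT check this one.
    pub const ALL: [NodeType; 3] = [NodeType::Http, NodeType::Agent, NodeType::Sqs];

    /// The `node_type -> str` map. There is deliberately no `_ =>` arm:
    /// adding a variant without a name here fails to compile
    /// (error E0004, non-exhaustive patterns).
    pub fn as_str(self) -> &'static str {
        match self {
            NodeType::Http => "http",
            NodeType::Agent => "agent",
            NodeType::Sqs => "sqs",
        }
    }

    /// Inverse lookup, e.g. when loading a stored workflow definition.
    pub fn parse(name: &str) -> Option<NodeType> {
        Self::ALL.iter().copied().find(|t| t.as_str() == name)
    }
}
```

The same property holds for the dispatch `match`: as long as no arm is a wildcard, a new variant cannot be added without the compiler pointing at every `match` that needs a new arm.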

The engine itself is the interesting part — async DAG scheduling with parallel fan-out/fan-in, retries, timeouts, cancellation, and sub-workflows — but the node catalog is where most of the surface area lives, so I optimized hard for "cheap to add, hard to get subtly wrong."
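As a rough illustration of fan-out/fan-in ordering, the sketch below groups a DAG into "waves" with Kahn's algorithm: every node in a wave has all its dependencies in earlier waves, so a scheduler could run a wave concurrently. This is a generic, synchronous sketch, not the Trigix scheduler, which the post describes as async and which also handles retries, timeouts, and cancellation. The function name and the node ids in the usage note are made up.

```rust
use std::collections::HashMap;

/// Groups node ids into waves that respect the `(from, to)` edges.
/// Returns `Err` for an unknown node id or a cycle.
pub fn execution_waves<'a>(
    nodes: &[&'a str],
    edges: &[(&'a str, &'a str)],
) -> Result<Vec<Vec<&'a str>>, String> {
    // indegree[n] = number of dependencies of n that have not run yet.
    let mut indegree: HashMap<&'a str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    let mut children: HashMap<&'a str, Vec<&'a str>> = HashMap::new();
    for &(from, to) in edges {
        if !indegree.contains_key(from) {
            return Err(format!("edge references unknown node `{}`", from));
        }
        match indegree.get_mut(to) {
            Some(d) => *d += 1,
            None => return Err(format!("edge references unknown node `{}`", to)),
        }
        children.entry(from).or_default().push(to);
    }

    // Nodes with no dependencies form the first wave.
    let mut ready: Vec<&'a str> =
        nodes.iter().copied().filter(|n| indegree[*n] == 0).collect();
    let mut waves = Vec::new();
    let mut scheduled = 0;
    while !ready.is_empty() {
        ready.sort(); // stable order inside a wave, for readable output
        scheduled += ready.len();
        let mut next = Vec::new();
        for n in &ready {
            if let Some(kids) = children.get(*n) {
                for &kid in kids {
                    let d = indegree.get_mut(kid).expect("validated above");
                    *d -= 1;
                    if *d == 0 {
                        next.push(kid); // fan-in: last dependency finished
                    }
                }
            }
        }
        waves.push(ready);
        ready = next;
    }
    if scheduled == nodes.len() {
        Ok(waves)
    } else {
        Err("workflow graph contains a cycle".to_string())
    }
}
```

For a diamond-shaped workflow (`fetch` feeding `parse` and `embed`, both feeding `store`), this yields three waves: `fetch` alone, then `embed` and `parse` together, then `store`.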

Most SaaS/DB/vector-store/cloud nodes are thin HTTP clients that return `{ status, body }`. The value of a first-class node over "just use the generic HTTP node" is the curated config UI and auth handling, not raw capability. That framing kept ~150 of the nodes small and uniform.

For cloud services I leaned on caller-supplied tokens where possible (e.g. GCS, Vertex AI, BigQuery, Snowflake all take a bearer token) instead of baking in each provider's OAuth dance. Honest trade-off: less magic, but no giant SDK per provider and no credential-exchange code to maintain.

A self-hosted tool has to build everywhere with `cargo build`, with no "first install these system libraries" footnote. That constraint drove a few choices:

**AWS (SQS/SNS/Bedrock).** Instead of pulling in the AWS SDK, I implemented **Signature V4 from scratch** with the crypto crates already in the tree (`sha2`/`hmac`/`hex`). The reassuring part: AWS publishes a SigV4 test suite, so the signer is unit-tested against the canonical `get-vanilla` vector:

```rust
// The signer must reproduce AWS's published signature exactly.
assert!(auth.ends_with(
  "Signature=5fa00fa31553b73ebf1942676e86291e8372ff2a2260956d9b8aae1d763fbf31"
));
```

That single test is worth more than a hundred round-trip mocks — it pins the canonical-request and signing-key derivation to a known-good answer.
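For readers who haven't seen SigV4, the canonical request is the part that is pure string assembly. The sketch below covers only that step, using the standard library alone. The function name is hypothetical, the payload hash is passed in (std has no SHA-256; a real signer would compute it, for example with `sha2`), and the path and query string are assumed to be already normalized and encoded. It is not Trigix's signer.

```rust
/// Builds a SigV4 canonical request:
/// method, path, query, canonical headers, a blank line, signed headers, payload hash.
pub fn canonical_request(
    method: &str,
    path: &str,
    query: &str,
    headers: &[(&str, &str)],
    payload_sha256_hex: &str,
) -> String {
    // Header names are lowercased and sorted; values are trimmed.
    // (The full spec also collapses runs of spaces inside values; omitted here.)
    let mut canon: Vec<(String, String)> = headers
        .iter()
        .map(|(k, v)| (k.to_ascii_lowercase(), v.trim().to_string()))
        .collect();
    canon.sort();

    // Each canonical header line ends with a newline, including the last one.
    let header_block: String = canon
        .iter()
        .map(|(k, v)| format!("{}:{}\n", k, v))
        .collect();
    let signed_headers = canon
        .iter()
        .map(|(k, _)| k.as_str())
        .collect::<Vec<_>>()
        .join(";");

    format!(
        "{}\n{}\n{}\n{}\n{}\n{}",
        method, path, query, header_block, signed_headers, payload_sha256_hex
    )
}
```

Called with the `get-vanilla` inputs (a `GET /` with `Host: example.amazonaws.com`, `X-Amz-Date: 20150830T123600Z`, and the SHA-256 of an empty body), this produces the eight-line canonical request that is then hashed into the string to sign. Getting the blank line after the header block wrong is the classic way to end up with a signature mismatch.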

**SSH/SFTP.** The obvious crate (`ssh2`) binds to libssh2, a system dependency that breaks the build for anyone without the dev headers (and `cmake`). So I used **russh + russh-sftp**, a pure-Rust SSH implementation. `cargo build` stays self-contained; password and private-key auth both work.

**SQL Server.** Same reasoning: `tiberius` is a pure-Rust TDS driver, so the MSSQL node needs no native client.

**OCR.** This one I *couldn't* keep pure-Rust without a heavy native dependency, so instead of linking libtesseract at build time, the node shells out to the `tesseract` CLI at runtime. The workspace still builds everywhere; OCR just needs the binary present when that node actually runs. Being explicit about that boundary beats a build that fails on a clean machine.
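A minimal sketch of what that runtime boundary can look like with `std::process::Command`. The function name and error wording are assumptions, not the Trigix implementation; the only Tesseract behavior relied on is that passing `stdout` as the output base prints the recognized text to standard output.

```rust
use std::io::ErrorKind;
use std::path::Path;
use std::process::Command;

/// Runs `<binary> <image> stdout` and returns the recognized text.
/// `binary` is a parameter (normally "tesseract") so the
/// missing-binary path can be exercised in a test.
pub fn run_ocr(binary: &str, image: &Path) -> Result<String, String> {
    let output = Command::new(binary)
        .arg(image)
        .arg("stdout") // output base "stdout" = print text instead of writing a file
        .output()
        .map_err(|e| match e.kind() {
            // The dependency is only discovered here, when the node actually runs.
            ErrorKind::NotFound => format!(
                "`{}` is not installed or not on PATH; the OCR node needs it at runtime",
                binary
            ),
            _ => format!("failed to start `{}`: {}", binary, e),
        })?;

    if !output.status.success() {
        return Err(format!(
            "`{}` exited with {}: {}",
            binary,
            output.status,
            String::from_utf8_lossy(&output.stderr).trim()
        ));
    }
    String::from_utf8(output.stdout)
        .map_err(|e| format!("OCR output was not valid UTF-8: {}", e))
}
```

The useful property is the error path: a machine without the binary gets a clear message from the one node that needs it, while every other node and the build itself are unaffected.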

The theme: a system dependency at **build** time is a tax on every contributor and CI run; a dependency at **runtime** is opt-in and only paid by the people who use that feature.

The LLM/agent/RAG nodes speak the OpenAI-compatible wire format, so you can point them at a local **Ollama** or **vLLM** and run the whole AI side offline. RAG retrieval runs on your own Postgres/pgvector with hybrid (vector + full-text) search, optional reranking, and an HNSW index. Cloud providers are optional, not assumed — which matters if you're self-hosting precisely to avoid sending data out.
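The post doesn't say how the vector and full-text result lists are merged. One common option for hybrid search is reciprocal rank fusion (RRF), sketched below purely as an illustration of the idea; the function name, the document ids, and the constant `k = 60` in the usage note are assumptions, not something taken from Trigix.

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion: each list contributes 1 / (k + rank) to a
/// document's score, with rank starting at 1. Documents found by both
/// retrievers accumulate two contributions and rise to the top.
pub fn rrf_fuse<'a>(vector_hits: &[&'a str], text_hits: &[&'a str], k: f64) -> Vec<&'a str> {
    let mut scores: HashMap<&'a str, f64> = HashMap::new();
    for list in [vector_hits, text_hits] {
        for (i, id) in list.iter().enumerate() {
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + i as f64 + 1.0);
        }
    }

    let mut ranked: Vec<(&'a str, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by id so the order is deterministic.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(b.0)));
    ranked.into_iter().map(|(id, _)| id).collect()
}
```

With vector hits `["a", "b", "c"]`, full-text hits `["b", "d"]`, and `k = 60.0`, document `b` comes first because it appears in both lists, followed by `a`, `d`, and `c`. RRF only needs ranks, so it sidesteps the problem that cosine distances and full-text scores are not on a comparable scale.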

To try it:

```bash
docker compose -f docker-compose.poc.yml up -d --build
```

That brings up the console/API, Postgres+pgvector, Redis, and an optional AI runtime. There's also a Helm chart on GHCR and a one-click GitHub Codespaces devcontainer if you want to poke at it before committing a VM.

**Honest status:** it's young and I'm the primary author; 600+ tests and a v1.3.0 release, but treat it accordingly. And full transparency since it's relevant on this platform — I used AI coding assistants while building it; it's a real, tested codebase, not a one-shot generated repo.

Repo: [https://github.com/bj-qizhi/trigix](https://github.com/bj-qizhi/trigix) — feedback on the engine design and the node model especially welcome.
