
InfraFactory

InfraFactory, an open-source infrastructure-as-code tool, now generates and validates OpenTofu configurations across AWS, GCP, and Scaleway using LLMs against deterministic mock servers in seconds, eliminating the need for cloud credentials or real API calls. The tool closes the slow, expensive feedback loop of hand-iterating Terraform against real clouds by running scenario YAML through a four-layer validation pipeline (static analysis, mock deployment, real deployment, and destruction), with structured failures fed back into subsequent LLM iterations. Users can run end-to-end infrastructure validation locally in approximately 60 seconds using three commands, with the system converging on successful deployments after an average of two LLM iterations.

Published Jun 3, 2026 · 11 min read

Scenario-driven OpenTofu generation and validation across AWS, GCP, and Scaleway: generated by an LLM, validated against deterministic mock servers in seconds, optionally deployed against real cloud APIs.

Hand-iterating IaC against real cloud APIs is slow, expensive, and flaky. LLMs are good at writing Terraform but bad at debugging "why didn't this apply": the error messages are layers deep and the feedback loop is 90 seconds per attempt against a real cloud.

InfraFactory closes that loop. You write a scenario YAML declaring intent (resources + acceptance criteria). The pipeline generates HCL with an LLM, validates it through four layers (static → mock-deploy → real-deploy → destruction), and feeds structured failures back into the next iteration's prompt. Subsecond mock validation, no cloud credentials required.

infrafactory run scenarios/training/gcp-pubsub.yaml against fakegcp: scenario YAML → 3-phase LLM generation → 3-layer validation → the AI's first iteration fails (fakegcp rejects google_project_service) → feedback fed into the next iteration's prompt → second iteration converges to Status: success. Demonstrates the feedback loop that makes the pipeline robust against partial mock coverage. Re-record with ./docs/demo/record.sh (requires make mocks-up + an LLM credential in env).

Actually runs gcp-pubsub through the UI: scenario page → click Run → the Live page populates with iteration stages as the AI tries to build the topic + subscription against fakegcp → iteration 1 fails (fakegcp doesn't model google_project_service yet) → the AI sees the feedback in iteration 2's prompt and converges → success banner → the per-run IaC viewer shows the converged HCL with auto-injected *_custom_endpoint overrides pointing at fakegcp. ~2min end-to-end, 2 LLM iterations. Re-record with make demo-ui-run (needs make mocks-up + Claude CLI authenticated).

Browser walkthrough of full-stack-paris (the most resource-dense scenario): no infrafactory run, just a tour of the Scenario / Runs / Compare / Pitfalls / Diagnostics pages so viewers see the UI surface (24s, no LLM credit needed). Re-record with make demo-ui.

Three commands get you a working LLM-driven infra pipeline against local mock servers, validate a real Terraform scenario end-to-end, and tear everything down cleanly. No cloud credentials. No real cloud calls. ~60 seconds.

mkdir -p ~/dev && cd ~/dev
for repo in infrafactory fakeaws fakegcp mockway; do
  git clone https://github.com/redscaresu/$repo.git
done
cd infrafactory

make up

./bin/infrafactory run scenarios/training/block-paris.yaml --config infrafactory.yaml


make down

You should see Status: success and run/terminal_reason: pass (target_reached) after step 3. The LLM generated a Scaleway Block Storage volume in HCL, and the static validator + mockway apply + topology test + destroy/orphan-check all passed. The default run tears the resources down at the end of the test cycle (the scenario's destruction: no_orphans acceptance criterion), so http://127.0.0.1:8080/mock/state reports empty collections. To inspect the post-apply state, add --no-destroy to the run command.
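The orphan check described above can be pictured as "every collection in the mock state is empty". The following Go sketch illustrates that idea only; it assumes the state payload is a JSON object whose values are arrays, which the text does not confirm, and the function name and sample payload are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// nonEmptyCollections returns the names of top-level collections that still
// hold at least one entry. The "object of arrays" shape is an assumption,
// not the documented /mock/state contract.
func nonEmptyCollections(state []byte) ([]string, error) {
	var collections map[string][]json.RawMessage
	if err := json.Unmarshal(state, &collections); err != nil {
		return nil, fmt.Errorf("decode mock state: %w", err)
	}
	var orphans []string
	for name, items := range collections {
		if len(items) > 0 {
			orphans = append(orphans, name)
		}
	}
	sort.Strings(orphans) // stable output regardless of map iteration order
	return orphans, nil
}

func main() {
	// Hypothetical payload; in practice the bytes would come from GET /mock/state.
	state := []byte(`{"volumes": [], "servers": [{"id": "srv-1"}]}`)
	orphans, err := nonEmptyCollections(state)
	if err != nil {
		panic(err)
	}
	fmt.Println("orphaned collections:", orphans)
}
```

With --no-destroy the same check would be expected to report the resources the scenario created, since nothing has been torn down yet.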

Use make status at any time to see which of the six ports (8080, 8081, 8082, 9090, 9091, 4173) are listening.

  • Go 1.25+
  • OpenTofu (https://opentofu.org) on PATH
  • Docker (for the SeaweedFS S3 backend used by AWS scenarios); only needed when running AWS-cloud scenarios, Scaleway-only and GCP-only demos don't require it
  • An LLM credential, see below

InfraFactory drives generation through the Claude CLI by default: sign in with claude login once and it works out of the box. To use a different model via OpenRouter instead, export OPENROUTER_API_KEY and set agent.type: openrouter in infrafactory.yaml. Both paths hit the same 3-phase generation pipeline (plan → write HCL → self-review); pick whichever fits your budget/latency profile.

Port | Service | Why
8080 | mockway | Scaleway HTTP API mock
8081 | fakegcp | GCP API mock
8082 | fakeaws | AWS API mock
9090 | SeaweedFS | S3-compatible backend (Docker; AWS-only scenarios)
9091 | s3router (S80) | HTTP shim that fans S3 traffic across SeaweedFS (data plane) and fakeaws (the ?publicAccessBlock subresource SeaweedFS doesn't model). infrafactory.yaml s3.url points here, not directly at SeaweedFS. See cmd/s3router/.
4173 | infrafactory UI | SvelteKit dashboard + scenario runner

After make up, any of these run against the same stack:

./bin/infrafactory run scenarios/training/gcp-full-stack.yaml      # cloud: gcp      → fakegcp
./bin/infrafactory run scenarios/training/aws-full-stack.yaml      # cloud: aws      → fakeaws
./bin/infrafactory run scenarios/training/full-stack-paris.yaml    # cloud: scaleway → mockway

There are 39 scenarios under scenarios/training/. Inspect generated HCL at output/<scenario>/ (overwritten each run) and immutable per-run artifacts at .infrafactory/runs/<scenario>/<run-id>/.

A successful run ends with Status: success and run/terminal_reason: pass (target_reached). If a validation layer fails, the failure JSON feeds into the next iteration's LLM prompt and the loop retries (default budget: 5 iterations).

make up already started the UI on http://127.0.0.1:4173. If you'd rather start just the UI (without the mocks), use make run.

The UI provides a scenario browser (edit YAML, see real-time validation), run controls (--clean / --no-destroy / Layer-3 toggles), a live page with per-iteration timer and stage indicators, a per-run IaC viewer with diffs, and a pitfalls editor. See the UI demo above for the full-stack-paris walkthrough.

scenario YAML  ──▶  3-phase LLM generation  ──▶  3-layer validation  ──▶  retry on failure
   (intent)         plan → write HCL → review    static / mock / real     (5x budget)

Three-phase generation (prompts/{aws,gcp,scaleway}/phase{1,2,3}*.md):

  1. Plan architecture: scenario YAML + T-shirt size mappings → JSON architecture plan
  2. Generate HCL: architecture + cloud-specific pitfalls + provider schema → OpenTofu .tf files
  3. Self-review: generated HCL → 10-point checklist → corrections or NO ISSUES FOUND

Three-layer validation (each gates the next):

  1. Static: tofu init/validate/plan + OPA deny policies on the plan JSON
  2. Mock deploy: tofu apply against the matching mock; topology checks against /mock/state; OPA deny_state policies; mock-enforced FK integrity
  3. Real deploy (optional, gated by validation.layers.sandbox_deploy.enabled): tofu apply against the real cloud with auto-destroy on failure

On failure, the structured failure (layer, stage, check, detail, failure_class) is appended to the next iteration's prompt as a <feedback> block so the LLM sees what specifically broke.

Auto-learning loop: when an iteration self-corrects (iter N+1 succeeds after iter N failed) OR a run terminates with stuck/repair_budget_exhausted, the failure detail is extracted into pitfalls/<cloud>.yaml so future runs of any scenario in that cloud see the lesson up front. Every entry is marked source: learned; the system seeds itself from real runs, and a CI ratchet (TestPitfallsNoHumanSeeding) rejects hand-authored entries.

When a run reaches target_reached AFTER ≥1 failing iteration, a second extractor (source: learned_from_diff) diffs the failing iteration's HCL against the passing iteration's HCL and emits the minimal HCL snippet that resolved the failure: prescriptive guidance derived from real runs, not hand-written. This unblocked the "prompt-collapse" effort: prescriptive rules in the phase-2 prompts were retired as the system's learned pitfalls replaced them. See ADR-0012 (and the N11 retirement framework in ADR-0018) for the dynamic-pitfalls contract.
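The two extraction ideas above can be sketched in a few lines of Go. shouldExtract encodes the stated triggers (a self-correction, or termination with stuck / repair_budget_exhausted). addedLines is a deliberately naive stand-in for the diff step: it is a set difference over trimmed lines, not the project's extractor, and the resource in the example is a placeholder rather than a real provider type.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldExtract mirrors the triggers described above. iterationPassed holds
// one entry per iteration, in order; terminalReason is the run's final reason.
func shouldExtract(iterationPassed []bool, terminalReason string) bool {
	for i := 1; i < len(iterationPassed); i++ {
		if !iterationPassed[i-1] && iterationPassed[i] {
			return true // iter N failed, iter N+1 succeeded
		}
	}
	return terminalReason == "stuck" || terminalReason == "repair_budget_exhausted"
}

// addedLines returns the trimmed, non-empty lines that appear in the passing
// HCL but not in the failing HCL. A real diff would preserve context and order
// of removals; this only approximates "what was added to make it pass".
func addedLines(failing, passing string) []string {
	seen := make(map[string]bool)
	for _, line := range strings.Split(failing, "\n") {
		seen[strings.TrimSpace(line)] = true
	}
	var added []string
	for _, line := range strings.Split(passing, "\n") {
		trimmed := strings.TrimSpace(line)
		if trimmed != "" && !seen[trimmed] {
			added = append(added, trimmed)
		}
	}
	return added
}

func main() {
	// Placeholder HCL; example_resource is not a real provider resource.
	failing := "resource \"example_resource\" \"a\" {\n  name = \"x\"\n}"
	passing := "resource \"example_resource\" \"a\" {\n  name = \"x\"\n  size = 10\n}"
	fmt.Println(shouldExtract([]bool{false, true}, "pass"))
	fmt.Println(addedLines(failing, passing))
}
```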

Each cloud has the same set of extension points; the scenario's cloud: field drives every dispatch.

Extension point | AWS | GCP | Scaleway
Mock server | fakeaws (:8082) + SeaweedFS for S3 (:9090) | fakegcp (:8081) | mockway (:8080)
Provider | hashicorp/aws ~> 5.70 | hashicorp/google >= 5.0 (v5 for IAM SA) | scaleway/scaleway >= 2.50
Prompts | prompts/aws/ | prompts/gcp/ | prompts/scaleway/
Pitfalls | pitfalls/aws.yaml | pitfalls/gcp.yaml | pitfalls/scaleway.yaml
OPA policies | policies/aws/ | policies/gcp/ | policies/scaleway/
Training scenarios | scenarios/training/aws-*.yaml | scenarios/training/gcp-*.yaml | scenarios/training/*-paris.yaml
Full-stack scenario | aws-full-stack.yaml | gcp-full-stack.yaml | full-stack-paris.yaml

Each first-party mock is wire-shape compatible with the matching real provider, enforced by an examples/working/<svc> smoke harness in the mock's own repo (apply → plan -detailed-exitcode 0 → destroy). See each mock's README for the API-compatibility contract.

AWS S3 is the exception: bucket sub-resource reads (GetBucketPolicy / GetBucketTagging / etc.) are served by SeaweedFS instead of fakeaws's stripped-down S3 handler, because terraform-provider-aws's bucket Read flow needs the full management surface. SeaweedFS doesn't model ?publicAccessBlock, so a small reverse-proxy shim (cmd/s3router/, S80) fronts both backends: it routes ?publicAccessBlock to fakeaws and fans PUT/DELETE /<bucket> out to both so the bucket exists in both stores. The rationale and the SeaweedFS-vs-Adobe-S3Mock-vs-Garage-vs-LocalStack evaluation are documented in CONCEPT.md under "Third-Party Mock Integration".

Adding a new cloud requires: prompt templates, pitfalls file, topology derivation rules, mock server, OPA policies, and training scenarios. Dispatch is driven by cloudMockStateRouter, cloudConstraintPolicies, filterPolicyPathsByCloud, ExtractProviderSchemaForCloud, and detectCloud.
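As a mental model for cloud-driven dispatch, a lookup keyed on the scenario's cloud value is enough. The Go sketch below is not the implementation of cloudMockStateRouter or the other functions named above; mockFor and the URL map are hypothetical, the ports come from the port table, and using 127.0.0.1 for every mock is an assumption.

```go
package main

import "fmt"

// mockBaseURL maps a scenario's cloud value to its local mock.
// Ports are from the port table; the host is assumed to be 127.0.0.1 for all.
var mockBaseURL = map[string]string{
	"scaleway": "http://127.0.0.1:8080", // mockway
	"gcp":      "http://127.0.0.1:8081", // fakegcp
	"aws":      "http://127.0.0.1:8082", // fakeaws
}

// mockFor resolves the mock for a cloud, failing loudly on unknown clouds so
// that a new cloud must be registered explicitly.
func mockFor(cloud string) (string, error) {
	base, ok := mockBaseURL[cloud]
	if !ok {
		return "", fmt.Errorf("unsupported cloud %q", cloud)
	}
	return base, nil
}

func main() {
	for _, cloud := range []string{"aws", "gcp", "scaleway", "azure"} {
		base, err := mockFor(cloud)
		fmt.Println(cloud, base, err)
	}
}
```

Under this model, adding a cloud is one more map entry plus the per-cloud assets listed above.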

Command | Purpose
infrafactory init --path <file> | Scaffold a new scenario YAML
infrafactory validate <scenario> | Layer 1 static checks only
infrafactory generate <scenario> | 3-phase LLM generation only
infrafactory test <scenario> | Layers 1-4 (no retry loop)
infrafactory run <scenario> | Full pipeline with retry loop + holdouts
infrafactory mock start/stop/status/logs | Manage the Mockway (Scaleway) mock only. Use make mocks-up/-down/-status/-logs to manage all three (mockway/fakegcp/fakeaws).
infrafactory mock reset | Reset state across every configured mock backend (mockway + fakegcp + fakeaws + s3 cascade in one call). Use this between scenarios in sweep harnesses instead of a bare curl to /mock/reset; only this path cascades to the SeaweedFS s3 backend.
infrafactory ui | Serve the web dashboard

Auxiliary binary (bin/n10extract, built by make build) drives the N10/N13 prescriptive-pitfall extractors against a recorded run directory and emits a candidate pitfalls/<cloud>.yaml snippet on stdout. It is used by the N11 retirement protocol's step 2 when the organic learning loop hasn't fired for the target pattern. See docs/decisions/0012-dynamic-pitfalls.md and docs/decisions/0018-n11-retirement-criteria.md for the auto-learning architecture.

Key flags for run: --clean (fresh start), --no-destroy (keep resources for incremental follow-up), --repair-iterations-max N (retry budget, default 5).

Scenario YAML declares criteria that gate run success:

Type | Layer 2 (mock) | Layer 3 (real)
connectivity | Topology graph query | TCP connect with retry
http_probe | Topology graph query | HTTP GET, expect 2xx/3xx
dns_resolution | Auto-pass (informational) | DNS A/AAAA lookup with retry
policy | OPA rules on plan + state | Same
destruction | Orphan check after destroy | Same + real destroy

infrafactory/
├── cmd/infrafactory/      CLI + embedded UI build
├── internal/
│   ├── cli/               command tree, runtime wiring
│   ├── config/            infrafactory.yaml
│   ├── scenario/          YAML + JSON schema validation
│   ├── generator/         3-phase LLM pipeline (Claude / OpenRouter)
│   ├── harness/           static/mock/destroy primitives + provider schema extraction
│   ├── feedback/          failure-signature modelling, stuck detection
│   ├── runstore/          .infrafactory/runs persistence
│   └── e2e/               cross-repo end-to-end tests
├── ui/                    SvelteKit dashboard
├── scenarios/training/    per-cloud training scenarios
├── prompts/{aws,gcp,scaleway}/  phase 1-3 prompt templates
├── pitfalls/{aws,gcp,scaleway}.yaml  static + learned pitfalls
├── policies/{aws,gcp,scaleway}/  OPA rego files
├── scenario.schema.json   scenario contract
└── infrafactory.yaml      runtime config contract

Pre-commit hook runs gitleaks + make test (Go unit + UI unit + Playwright e2e). Wire it once:

make install-hooks

Common targets:

make test                  # full suite
make test-unit             # Go only
make ui-test-e2e           # Playwright only

make mocks-up              # start mockway + fakegcp + fakeaws (+ SeaweedFS via Docker)
make mocks-down            # stop them all
make mocks-status          # show port + PID for each (probes lsof, not just pidfiles)
make mocks-restart         # mocks-down + mocks-up; picks up sibling-repo source changes
make mockway-restart       # restart just one mock (also: fakegcp-restart, fakeaws-restart)

make mocks-up-containers   # build + start fakeaws + fakegcp + mockway
make mocks-down-containers
make mocks-pull            # refresh published GHCR images

make sweep-39

make clean-bg

When working on a sibling mock repo (../fakegcp, ../fakeaws, ../mockway), make mocks-up spins up those mocks via go run, which compiles ONCE at boot. After committing a change in the sibling repo, run make <mock>-restart (e.g. make fakegcp-restart) to pick up the new source; otherwise the running mock keeps serving the stale binary, a footgun that has wasted several debugging sessions' worth of time.

Gated e2e tests (cross-repo, require tofu + the sibling mock repos checked out):

INFRAFACTORY_ENABLE_E2E=1 go test ./internal/e2e/...

  • docs/architecture.md: component overview and validation-layer details
  • docs/decisions/: ADRs (dynamic pitfalls, topology derivation, etc.)
  • docs/scenario-failure-matrix.md: per-scenario pass/fail snapshot + failure classification
  • AGENTS.md: entry point for AI agents working on this repo
  • CONTRIBUTING.md: code conventions, PR contract, quality gates
  • SECURITY.md: disclosure policy
  • CHANGELOG.md

Apache 2.0; see LICENSE.
