Every Platform Promises "Any System." Here's Why They Don't Deliver.

Pulumi co-founder Paul Stack argues that infrastructure-as-code (IaC) tools have historically failed to deliver on their promise of supporting "any system" because their provider ecosystems rely on human-written integrations that become bottlenecks. Stack contends that traditional CRUD-based lifecycles force users into workarounds like shell scripts and null_resource blocks for operations that don't fit the model. He introduces Swamp, a new tool designed with deep extension points that allow users to add custom methods and integrations without waiting for vendor support, making it inherently compatible with agent-driven automation.

Every IaC tool promised "any system." Few delivered once you tried to use it in your own business. Provider ecosystems were designed for humans who can fill in the gaps. Agent ecosystems need a different contract. That's what extensions are built for.

I've made this promise myself. More than once. I travelled to conferences and talked about it because I was sure I was right. I wanted to be right.

Every IaC tool promised it. Every conference keynote, every vendor deck, every README: one platform, every system, your infrastructure. I helped build some of the tools that made those promises: built providers, wrote modules, shipped integrations. Each generation of tool was genuinely better than the last, but few, if any, of them truly delivered "any system" in a way that held up once you tried to use it in your own business.

The tools got closest when open source communities filled the gaps. Terraform's provider ecosystem, Chef's community cookbooks, Puppet Forge. The "any system" promise was largely delivered by volunteers, not the vendor, and it worked because humans were writing providers for other humans to read and configure. It was almost impossible for the vendor to keep up with the ever-evolving ecosystem.

The agentic shift changes both sides of that equation. What contribution looks like is changing, and what consumers need from an extension is changing with it. If the ecosystem needs rethinking anyway, the extension model should be part of that conversation.

The underlying problem is always the same: the tool works well for the systems it was built around, and everything else gets a provider that somebody wrote, somebody maintains, and somebody has to learn from scratch. That's the 200% problem I wrote about in the determinism post (https://stack72.dev/deterministic-automation-for-a-probabilistic-system/): you learn the tool's abstraction, then you learn the underlying API it models.
Every new system resets the clock.

But there's a deeper constraint too. Traditional IaC tools are built around a CRUD lifecycle where every resource has create, read, update, delete, and that's the contract. If you need to do something that doesn't fit, you're working around the tool rather than with it. I've written `null_resource` blocks with `local-exec` provisioners that shell out to scripts that call APIs: three layers of workaround for something that should have been a single method called `rotate` or `drain` or `debug`.

That's not a criticism of the tools themselves. Data sources and provisioners were smart adaptations; the teams behind them recognised the gap and built escape hatches to keep up with what users actually needed. But they're still escape hatches, bolted onto a lifecycle that wasn't designed for those operations.

Why we built it differently

I've maintained tools where the issue tracker was dominated by two kinds of request: "add support for provider X" and "add feature Y to the existing provider." Hundreds of them, thousands over the life of a tool, each one reasonable and each one blocked on a maintainer having time to write it, review it, ship it. The backlog becomes a wall between the user and the thing they're trying to automate.

We didn't want swamp to work that way. The goal was never to be the team that writes every integration; it was to build extension points deep enough that we never become the bottleneck. Your infrastructure, your secrets provider, your internal APIs, your report needs: you should be able to model them without waiting for us to add support. And if an existing extension covers your domain but doesn't have the method you need, you should be able to add it yourself without forking anything.

It also turns out this makes the tool work incredibly well with agents. An agent isn't constrained to whatever operations the maintainer decided to ship. If the method exists, the agent can discover it and use it.
If it doesn't exist, you or the agent can add it. The agent carries out the work you instructed it to do rather than hitting a wall because the provider doesn't support that operation yet.

What an extension actually is

A swamp extension is a package that contains everything needed to interact with a system: model types, workflows, vaults for secret management, execution drivers, datastores, reports, and skills. Any combination, versioned together, published to a registry, pulled with a single command. The manifest looks like this:

```yaml
manifestVersion: 1
name: "@acme/deploy"
version: "2026.06.01.1"
description: "Deployment automation for Acme's infrastructure"
models:
  - service.ts
workflows:
  - staged-deploy.yaml
dependencies:
  - "@swamp/aws/ec2"
tags:
  - deployment
  - aws
```

Scoped names (`@collective/name`) so you know who published it, CalVer versioning so you know when. Dependencies are bundled into the package, and every download is SHA-256 verified. TypeScript files get safety-analysed before push and after pull, with dependency trust audits running against vulnerability databases. No eval and no code injection.

A model type

Most extensions contain one or more model types.
Here's a deployment model:

```js
import { z } from "zod";

export const model = {
  type: "@acme/deploy/service",
  version: "2026.06.01.1",
  description: "Deploy a service to a target environment",
  globalArguments: z.object({
    service: z.string().describe("Service name"),
    region: z.enum(["us-east-1", "us-west-2", "eu-west-1"]),
    replicas: z.number().int().min(1).max(50),
  }),
  resources: {
    result: {
      schema: z.object({
        deployed: z.boolean(),
        endpoint: z.string().url(),
      }),
    },
  },
  checks: {
    "region-safety": {
      description: "Block deployment to unstable regions",
      labels: ["policy"],
      execute: async (context) => {
        const blocked = ["us-east-1a"];
        if (blocked.includes(context.globalArgs.region)) {
          return {
            pass: false,
            errors: [`${context.globalArgs.region} is currently unstable`],
          };
        }
        return { pass: true };
      },
    },
    "replica-ceiling": {
      description: "Enforce account-level replica limit",
      labels: ["policy"],
      execute: async (context) => {
        if (context.globalArgs.replicas > 20) {
          return {
            pass: false,
            errors: ["Account limit is 20 replicas; raise a quota request"],
          };
        }
        return { pass: true };
      },
    },
  },
  methods: {
    deploy: {
      description: "Deploy the service",
      arguments: z.object({
        image: z.string().describe("Container image tag"),
      }),
      execute: async (args, context) => {
        // actual deployment logic
        const handle = await context.writeResource("result", "result", {
          deployed: true,
          endpoint: `https://${context.globalArgs.service}.example.com`,
        });
        return { dataHandles: [handle] };
      },
    },
  },
};
```

If you've read the encoding knowledge post (https://stack72.dev/the-first-step-on-your-ai-journey-is-encoding-what-you-already-know/), those checks should look familiar. The region that's been unstable since March is a pre-flight check now. The account-level replica limit someone discovered during an outage is encoded as a check too, sitting below the hard maximum in the schema. Both run automatically before any mutating method, whether the caller is a human running a CLI command or an agent executing a workflow.

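To make that ordering concrete, here is a minimal sketch of checks gating a method. It is not swamp's runtime: `runWithChecks`, the `Context` shape, and the result types are hypothetical names invented for illustration, and the checks run synchronously here for brevity where the model above declares them `async`.

```typescript
// Hypothetical sketch: illustrates "checks run before the method",
// not swamp's actual implementation.
type CheckResult = { pass: true } | { pass: false; errors: string[] };

interface Context {
  globalArgs: { service: string; region: string; replicas: number };
}

interface Check {
  description: string;
  execute: (context: Context) => CheckResult; // synchronous for brevity
}

type Method = (context: Context) => string;

type RunResult =
  | { ok: true; output: string }
  | { ok: false; errors: string[] };

// Run every check first; call the method only if all of them pass.
function runWithChecks(
  checks: Record<string, Check>,
  method: Method,
  context: Context,
): RunResult {
  const errors: string[] = [];
  for (const [name, check] of Object.entries(checks)) {
    const result = check.execute(context);
    if (!result.pass) {
      // Prefix each error with the check name so the caller can see which rule fired.
      errors.push(...result.errors.map((e) => `${name}: ${e}`));
    }
  }
  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, output: method(context) };
}

const checks: Record<string, Check> = {
  "replica-ceiling": {
    description: "Enforce account-level replica limit",
    execute: (context) =>
      context.globalArgs.replicas > 20
        ? { pass: false, errors: ["Account limit is 20 replicas"] }
        : { pass: true },
  },
};

// Stand-in for a mutating method; returns the endpoint it would have created.
const deploy: Method = (context) =>
  `https://${context.globalArgs.service}.example.com`;

const allowed = runWithChecks(checks, deploy, {
  globalArgs: { service: "api", region: "eu-west-1", replicas: 2 },
});
const blocked = runWithChecks(checks, deploy, {
  globalArgs: { service: "api", region: "eu-west-1", replicas: 50 },
});
console.log(allowed, blocked);
```

The second call never reaches `deploy`: the failing check short-circuits it and the caller gets the reason back as data, which is the part that matters when the caller is an agent rather than a person reading a stack trace.
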
The Zod schemas validate inputs at creation time so type mismatches surface immediately, and they're also discoverable. An agent can query the CLI and see what arguments a model type accepts, what methods it exposes, what the outputs look like.

Those methods aren't locked to a CRUD lifecycle either. This model has `deploy`, but it could just as easily have `debug`, `rollback`, `drain`, or `migrate`, whatever the system actually needs. The Kubernetes debugging extensions (https://swamp-club.com/extensions/@swamp/kubernetes?ref=stack72.dev) have methods like `checkPodHealth` and `validateSelectors`, while the issue lifecycle extension has `triage`, `plan`, `iterate`, `approve`, which is about as far from create-read-update-delete as you can get.

Extensions are also open to extension themselves. Say your security team needs every VPC creation to pass a CIDR overlap check. In the provider model, that means either running a fork of the provider, or a pull request to the upstream repo, a review cycle, a release, and weeks or months of waiting before it ships. If it ships at all, because the maintainer might not agree it belongs in the core provider. With swamp, you extend the type locally:

```js
import { z } from "zod";

export const extension = {
  type: "@swamp/aws/ec2/vpc",
  methods: [{
    "validate-cidr-policy": {
      description: "Check CIDR against company allocation policy",
      arguments: z.object({}),
      execute: async (args, context) => {
        // your org's CIDR validation logic
      },
    },
  }],
  checks: [{
    "no-cidr-overlap": {
      description: "Ensure CIDR doesn't overlap existing VPCs",
      labels: ["policy"],
      execute: async (context) => {
        return { pass: true };
      },
    },
  }],
};
```

Your check attaches to the base type and runs alongside the originals. It's enforced on the next deployment, not the next upstream release. An agent discovering the type sees everything, the methods from the registry extension and the ones your security team added, without knowing or caring which came from where.

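One way to picture that layering is as a merge of two definitions for the same type. The sketch below is an assumption about the idea, not swamp's loader: `mergeExtension` and `TypeDefinition` are hypothetical, methods and checks are simplified to name-keyed maps, the base `create` method is a placeholder, and rejecting name collisions is a policy I picked for the example.

```typescript
// Hypothetical sketch of layering a local extension onto a registry type.
interface Described {
  description: string;
}

interface TypeDefinition {
  type: string;
  methods: Record<string, Described>;
  checks: Record<string, Described>;
}

// Combine base and extension; refuse to silently replace an existing name.
function mergeExtension(
  base: TypeDefinition,
  ext: TypeDefinition,
): TypeDefinition {
  if (base.type !== ext.type) {
    throw new Error(`extension targets ${ext.type}, not ${base.type}`);
  }
  for (const kind of ["methods", "checks"] as const) {
    for (const name of Object.keys(ext[kind])) {
      if (name in base[kind]) {
        throw new Error(`${kind} "${name}" already exists on ${base.type}`);
      }
    }
  }
  return {
    type: base.type,
    methods: { ...base.methods, ...ext.methods },
    checks: { ...base.checks, ...ext.checks },
  };
}

// Placeholder base type; the real registry type's methods are not shown in this post.
const registryVpc: TypeDefinition = {
  type: "@swamp/aws/ec2/vpc",
  methods: { create: { description: "Placeholder base method" } },
  checks: {},
};

const securityTeam: TypeDefinition = {
  type: "@swamp/aws/ec2/vpc",
  methods: {
    "validate-cidr-policy": {
      description: "Check CIDR against company allocation policy",
    },
  },
  checks: {
    "no-cidr-overlap": {
      description: "Ensure CIDR doesn't overlap existing VPCs",
    },
  },
};

const merged = mergeExtension(registryVpc, securityTeam);
console.log(Object.keys(merged.methods), Object.keys(merged.checks));
```

Whatever the real mechanism looks like, the property worth keeping is the one visible in `merged`: a caller listing the type's methods and checks gets a single flat view, with no marker for which layer contributed what.
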
Wiring it together

Workflows wire multiple model interactions into a sequence:

```yaml
name: staged-deploy
inputs:
  service:
    type: string
  image:
    type: string
jobs:
  - name: deploy-staging
    steps:
      - name: deploy
        task:
          type: model_method
          modelType: "@acme/deploy/service"
          modelName: staging-${{ inputs.service }}
          methodName: deploy
          globalArgs:
            service: ${{ inputs.service }}
            region: eu-west-1
            replicas: 2
          inputs:
            image: ${{ inputs.image }}
  - name: deploy-production
    dependsOn:
      - job: deploy-staging
        condition:
          type: succeeded
    steps:
      - name: deploy
        task:
          type: model_method
          modelType: "@acme/deploy/service"
          modelName: prod-${{ inputs.service }}
          methodName: deploy
          globalArgs:
            service: ${{ inputs.service }}
            region: us-east-1
            replicas: 10
          inputs:
            image: ${{ inputs.image }}
```

Production only deploys if staging succeeded. The pre-flight checks from the model type run before each deployment. Data flows between steps through typed CEL expressions, and if a step references data that doesn't exist the expression fails loudly rather than passing blank values through.

An agent creates this, but you can run it without one. That's the point I keep coming back to from the determinism post: the agent does the work of creating the automation, then the automation runs deterministically from that point on.

How agents find what they need

`swamp model type search` returns every model type in the registry with its description, arguments, and methods. `swamp model type describe` returns the full schema for a specific type. The agent sees what parameters exist, what constraints apply, what methods are available, all without reading docs.

Extensions from trusted collectives auto-resolve on first use. When swamp encounters a type it hasn't seen before, say `@swamp/aws/ec2/instance`, it searches the registry, pulls the extension, loads it, and continues. The version gets pinned in a lockfile so the same resolution produces the same result next time.
If the extension isn't from a trusted collective, swamp asks you to pull it explicitly before it can be used.

This is the core of why extensions look the way they do, and it's worth saying directly. A human can work from incomplete information. They'll read docs, search GitHub issues, copy examples from Stack Overflow, and eventually figure out how a provider behaves even when the documentation is wrong or missing. I've done that hundreds of times and so have you. Agents don't have that path. If a capability isn't represented in a schema, an agent has to infer it from prose, and that's a much weaker contract. Provider ecosystems were designed for humans who can fill in the gaps. Agent ecosystems need everything to be explicit, typed, and discoverable, because there's nobody in the loop to compensate when it isn't.

What we use it for

We ship extensions for our own infrastructure. The Kubernetes debugging extensions from the determinism post are real: model types that know how to query pod state, inspect services, validate configmap references, and check image availability. The issue lifecycle extension (https://github.com/swamp-club/swamp-extensions?ref=stack72.dev) drives our entire development workflow from triage through implementation. Same mechanism, completely different domains.

The extensibility is one of the parts of swamp (https://github.com/swamp-club/swamp?ref=stack72.dev) I'm most proud of. Users model their own systems, extend what exists, and publish what they build. Nobody waits for us to get there first.

The open source communities that built the provider ecosystems for the last generation of tools did extraordinary work. The next generation of contribution isn't another provider. It's operational knowledge encoded in a form an agent can discover and use.