Show HN: Agentic Data Engineering

A developer introduces agentic data engineering, a practice where autonomous AI agents design, build, and maintain data pipelines from natural-language intent, shifting from manual coding to specifying goals and reviewing results. The approach promises efficiency but raises trust and governance challenges for production data.

The project has been on the roadmap for two quarters. Cohort retention. Lead scoring. LTV by channel. Every week without it costs something concrete: a board question you can't answer, a campaign you can't attribute, a churn signal you caught too late.

So you did what any technical operator would do in 2026: you opened ChatGPT, or Claude, and asked it to write the SQL. And it did. Beautifully. The queries ran, the numbers came back, the charts looked clean.

Then someone asked whether the retention number was actually right, and you couldn't say. No lineage, no tests, no agreed definition of what "active" even meant. Just fluent SQL nobody had verified. Looking great and being right, it turns out, are very different things.

That gap, between an AI that can write data code and an AI you can trust to ship it, is the whole story of agentic data engineering. This guide explains what the term means, how the workflow actually works, the tools that make it possible, and the part most vendors skip: how you let an agent touch production data without getting burned.

What is agentic data engineering?
Agentic data engineering is the practice of using autonomous AI agents to design, build, and maintain data pipelines from natural-language intent, with limited human oversight, instead of an engineer writing every transformation by hand. The agent plans the work, writes the code (ingestion, SQL, tests), runs it, checks the result, and corrects itself; a human reviews and approves the final change.

The key word is agentic. A plain AI assistant answers a question and stops. An agent works toward a goal across many steps on its own: it perceives the state of your data, reasons about what to do next, takes an action, reads the outcome, and loops until the goal is met. Researchers call this the perceive → reason → act → learn loop. In data engineering, that loop looks like: explore the warehouse, write a transformation, run the tests, read the failures, fix them, and present the finished change for review.

This is the shift from doing the "how" by hand to specifying the "what" and reviewing the result. You stop writing every line of SQL and start describing the metric you need, and the agent does the building. The promise is real, but so is the catch, which is the rest of this article.

Agentic vs. traditional data engineering, and vs. automation and copilots

Three things get confused with agentic data engineering. They're not the same.
| | What it does | Who decides the steps |
|---|---|---|
| Static automation (cron, Airflow DAGs) | Runs a fixed sequence someone wrote in advance | A human, ahead of time |
| AI copilot (autocomplete in your editor) | Suggests the next line or block while you drive | A human, line by line |
| AI agent | Pursues a goal across many steps, adapts to what it finds | The agent, within your guardrails |
| Traditional data engineering | A person hand-builds each pipeline, query, and test | A human, step by step |

A scheduler repeats what you already decided. A copilot autocompletes while you stay in control. An agent takes a goal ("build me a weekly cohort-retention model") and figures out the steps itself, including the ones you didn't anticipate. That autonomy is what makes it powerful, and exactly why the controls around it matter so much.

One more term to untangle: agentic analytics. The two are siblings, not synonyms. Agentic analytics works on the serving side: it asks questions of data that already exists (the BI and query layer). Agentic data engineering works one layer down: it builds and maintains the pipelines and models that produce that data in the first place. You need the engineering layer to be sound before the analytics layer can be trusted.

How agentic data engineering works

Under the hood, an agentic workflow runs your raw data through the same stages a human data team would, building an AI data pipeline driven by intent instead of tickets:

Ingestion. Source data lands in your warehouse from your apps, CRM, product database, and third-party tools. Connectors like Airbyte or Meltano handle the extract-and-load so the agent has raw tables to work from.

Transformation. The agent writes the models that turn raw tables into clean, business-ready ones, typically as dbt (https://www.getdbt.com/) models in a layered bronze → silver → gold structure, with tests attached.

Semantic layer.
Cleaned tables still don't know what your business means by "active user" or "qualified lead." A semantic layer encodes those definitions once, so every query, human or agent, uses the same math. We go deep on this in what a semantic layer is and why it matters (/blog/what-is-a-semantic-layer).

Serving. The finished metrics are queried by dashboards, notebooks, or, increasingly, by other AI agents over a protocol like MCP (the Model Context Protocol), which lets an agent ask your data questions in a structured, governed way.

You describe the metric; the agent explores the lakehouse, writes the dbt model, builds the semantic overlay, and runs the tests. That's the happy path. Now the part that decides whether any of it is trustworthy.

Why the model isn't the bottleneck: it's the harness

If you've already pointed an AI coding agent at a data problem and watched it produce confident garbage, you've met the real bottleneck. It isn't the model. It's the harness the model works against.

So what is a harness? A harness is the software layer around an AI model that makes its output safe to ship in production: the grounding that tells the agent which answer is right (your lineage, business semantics, and access policies) and the controls that catch a wrong answer before it lands (validation loops, data contracts, CI/CD, and an audit trail). A model writes code; a harness decides whether that code is safe to merge.

The intuition is simple: the model is only the engine. Impressive on a workbench, but useless until it's bolted into the rest of the car: a chassis, wheels, a steering wheel, pedals to control the power, and a dashboard that shows what's actually happening. The harness is the rest of the car.
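To make one of those controls concrete, here is a minimal sketch of a validation check that gates a merge on primary-key health (no NULLs, no duplicates). It is an illustration under stated assumptions, not how RevOS or dbt implements this: SQLite stands in for the warehouse, and the table name `cohort_retention`, the column `user_id`, and the helper names `check_primary_key`, `gate`, and `demo` are all hypothetical.

```python
import sqlite3


def check_primary_key(conn, table, key):
    """Return human-readable failures for a candidate primary key column."""
    failures = []
    # Identifiers are interpolated directly, so pass only trusted names here.
    null_count = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {key} IS NULL"
    ).fetchone()[0]
    if null_count:
        failures.append(f"{table}.{key}: {null_count} NULL value(s)")
    # Count distinct key values that appear on more than one row.
    dup_count = conn.execute(
        f"SELECT COUNT(*) FROM (SELECT {key} FROM {table} "
        f"WHERE {key} IS NOT NULL GROUP BY {key} HAVING COUNT(*) > 1)"
    ).fetchone()[0]
    if dup_count:
        failures.append(f"{table}.{key}: {dup_count} duplicated key value(s)")
    return failures


def gate(conn, checks):
    """Run every (table, key) check; the change passes only if none fail."""
    failures = [
        failure
        for table, key in checks
        for failure in check_primary_key(conn, table, key)
    ]
    return len(failures) == 0, failures


def demo(rows):
    """Load rows into a hypothetical model table and run the gate on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cohort_retention (user_id INTEGER, week INTEGER)")
    conn.executemany("INSERT INTO cohort_retention VALUES (?, ?)", rows)
    return gate(conn, [("cohort_retention", "user_id")])


if __name__ == "__main__":
    # One NULL key and one duplicated key: the gate should block this change.
    ok, failures = demo([(1, 0), (2, 0), (2, 1), (None, 0)])
    print("safe to merge" if ok else "blocked:", failures)
```

In a real harness a check like this would run in CI against the agent's proposed model, so that a failure shows up as a red check on the pull request rather than as a wrong number in a dashboard.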
Put a generic agent and an agent on a harness side by side:

| | Generic AI agent | Agent on a data harness |
|---|---|---|
| Starts from | A blank file: no schema, no definitions | Your schema, dbt models, and business definitions |
| Picks the right answer by | Guessing, fluently | Grounding: your lineage, semantics, and access policies |
| Catches a wrong answer with | Nothing; it ships | Validation loops, PK/unique checks, data contracts, CI/CD |
| Mistakes surface | In production, in a downstream dashboard | At review time, as a pull-request diff |
| In production it acts | Unsupervised | Scoped, time-bound, human-in-the-loop, fully audit-logged |

That gap is measurable. Snowflake put a number on it (https://www.snowflake.com/en/blog/engineering/cortex-analyst-text-to-sql-accuracy-bi/): on text-to-SQL (natural language to SQL), a general-purpose model (GPT-4o) scored just 51% on their internal evaluation, while grounding the same task in a governed semantic model pushed accuracy past 90% on real-world queries, nearly 2× single-shot GPT-4o. The difference between a wrong query and a right one is almost entirely context, not capability. That's also why data readiness (/blog/data-readiness-for-machine-learning), meaning clean, tested, well-defined inputs, matters more than which model you pick.

The market already senses this. In Cleanlab's 2025 survey (https://cleanlab.ai/ai-agents-in-production-2025/) of 1,837 engineering leaders, only ~5.2% run AI agents in production. dbt's 2026 report (https://www.getdbt.com/resources/state-of-analytics-engineering-2026) found 72% of practitioners want AI-assisted coding but only 24% trust it to manage pipelines. METR's 2025 study (https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/) even measured experienced developers running 19% slower with AI on familiar code while feeling faster. The appetite is real; the reliability isn't, because the harness isn't there.
One principle decides everything downstream: the agent fails at review time, not in production.

What it looks like in practice

Theory is cheap; here's a concrete worked example. At RevOS we built exactly this harness, so the abstract pieces above have real names.

What you install is a packaged offering: a set of APIs, a command-line interface, documentation, and curated agentic skills, wrapped in a dev container you open in your IDE (Visual Studio Code or similar) and drive with Claude Code (or a coding agent of your choice). There's no new UI to learn; it lives where you already write code.

Under that surface, RevOS wires together a best-of-breed stack (automated data ingestion, dbt for transformation, Cube.dev for the semantic layer, Git for versioning, BigQuery as the warehouse) so your agent starts with your schema, your models, and your definitions instead of a blank file. You describe the metric you need; the agent explores your lakehouse, writes the dbt model, and builds the semantic overlay. For the full product view, see how RevOS helps you build a revenue or growth data layer without hiring a data engineer (/product/agentic-data-engineering).

Then the part that earns trust: when the agent finishes a model, the harness doesn't accept it on faith. Validation loops run. Primary-key and unique-constraint checks fire. YAML data-contract enforcement kicks in. The change moves through the same Git CI/CD pipeline as your application code and lands on your desk as a pull request with a diff. When the agent gets it wrong (and it will), you catch it in the PR, not at 3 a.m. in a downstream dashboard.

The trust problem: governing an AI agent on production data

The single hardest question in agentic data engineering is the one most marketing pages dodge: how do you let an autonomous agent near production without it doing something irreversible? The answer is workflow, not faith.
Three controls do the heavy lifting:

Changes ship as pull requests. The agent never writes directly to production. It proposes a diff that runs through tests and CI; a human reads it and merges it. This is the concrete meaning of "fails at review time." A wrong model is a red check on a PR, not a corrupted table.

Permissions are scoped and time-bound. The agent gets exactly the access a task needs, for as long as the task takes, not standing admin rights. Mutating actions in production (schema changes, deletions, permission grants) keep a human in the loop by design.

Every action is audit-logged. An immutable trail of what the agent did, when, and why means you can answer "what changed?" after the fact. That is the difference between a controlled system and a black box.

Why be this strict? Because verbal guardrails don't bind an agent; technical ones do. The widely reported case of an AI agent wiping a live production database during a code freeze is the cautionary tale: it had every capability to help and none of the controls to be safe. Autonomy should scale as trust accrues, never the other way around.

Will AI agents replace data engineers?

Short answer: no, they change the job. The "AI data engineer" worth imagining is a tooling shift, not a headcount replacement.

The work that agents absorb is the repetitive build work: boilerplate models, test scaffolding, documentation, the tenth slightly-different staging table. What they don't absorb is judgment: deciding what a metric should mean, whether a result is plausible, and what the agent is allowed to touch.

So the role moves up the stack. Less time typing SQL; more time defining intent, reviewing the agent's pull requests, and owning the semantic layer and governance that keep the agent correct. The scarce, valuable skill becomes knowing what "right" looks like, which is exactly the skill a generic model lacks.
Put simply, an AI data engineer doesn't replace the human one; it hands the human a faster, more reviewable way to work. For the bigger picture of how AI reshapes data work, see our take on modern data strategy in the age of AI (/blog/modern-data-strategy-age-of-ai).

What tools power agentic data engineering

There's no single product called "agentic data engineering." It's a stack, and you can assemble it from best-of-breed parts:

- Ingestion: Airbyte, Fivetran, or Meltano move raw data into the warehouse.
- Transformation: dbt is the de facto standard for version-controlled, tested SQL models.
- Semantic layer: Cube.dev or dbt's own semantic layer encodes business definitions so queries stay consistent.
- Warehouse / lakehouse: BigQuery, Snowflake, Databricks, or an open table format like Iceberg or Delta.
- Version control + CI/CD: Git plus a pipeline that runs tests on every change. This is what makes the PR-review safety model possible.
- The agent: a coding agent like Claude Code (a data engineering agent once it's pointed at your warehouse), connected to your data and tools through MCP, that actually writes and runs the work.

The agent is the smallest piece. The other five are the harness, and they're what separate a fun demo from something you'd trust with a board number. Tools that bundle these into one governed workflow (RevOS pre-wires dbt, Cube.dev, BigQuery, and Git behind a Claude Code agent) save you from stitching them together yourself, but the architecture is the same whether you buy it or build it.

Where agentic data engineering is headed

The category is young and moving fast, and the direction is clear: data engineering is going AI-native, with pipelines built around agents from the first commit rather than retrofitted onto a hand-built stack. A few specifics worth watching:

More backends and open formats.
Iceberg and Delta are pulling the warehouse and the lake together, which gives agents one consistent surface to work against.

Broader warehouse support. Today many agentic tools are warehouse-specific; expect coverage to widen across Snowflake, Databricks, and Microsoft Fabric.

Open connectivity and semantic standards. Frameworks like Meltano for ingestion, and competing semantic-model languages, are converging toward portability, so your definitions aren't locked to one vendor.

Agents querying agents. As MCP matures, the consumer of your data layer is increasingly another AI agent, not a human in a dashboard. That raises the bar on governance, because the semantic layer is now the contract between machines.

The throughline: the model gets cheaper and smarter every quarter, but the value keeps moving to the harness around it. Whoever owns clean, governed, well-defined data wins, which is why we've argued the real AI advantage is your data, not the model (/blog/generative-ai-dilemma-opportunity-vs-monopoly).

Getting started

You don't need to adopt the whole category at once. The fastest way to feel the difference is to put an agent on a stack it already understands and ask for one real metric.

The reproducible version with RevOS: install the CLI and run revos init. You land in a working project preloaded with sample datasets and models, so there's something real to point your agent at from minute one. Ask it to explore the sample lakehouse and build your first model. From install to a production-ready model is an afternoon, not a quarter. The full set of agent capabilities, skills, and interfaces is on the free plan (up to 1 GB of storage), so trying it costs you an afternoon and nothing more.

    npm i -g @revos/cli
    revos init

The zero-to-scheduled-action walkthrough lives at cli.revos.dev (https://cli.revos.dev). Try it out today.

Frequently asked questions

- What is agentic data engineering?
  - Agentic data engineering is the practice of using autonomous AI agents to design, build, and maintain data pipelines from natural-language intent, instead of an engineer writing every transformation by hand. The agent plans the work, writes the code (ingestion, SQL, tests), runs it, checks the result, corrects itself, and a human approves the change.
- How is agentic data engineering different from traditional data engineering?
  - Traditional data engineering is task-driven: a person writes each pipeline, query, and test step by step. Agentic data engineering is intent-driven and autonomous: you describe the outcome you want, an AI agent plans and executes the steps and validates its own work with limited human oversight, and you review the result. The shift is from doing the 'how' by hand to specifying the 'what' and approving the diff.
- What's the difference between an AI copilot and an AI data agent?
  - A copilot suggests code one line at a time while you stay in the driver's seat. An agent works toward a goal across many steps on its own: it explores your data, writes a model, runs tests, reads the errors, and fixes them in a loop. A copilot autocompletes; an agent completes a task.
- Will AI agents replace data engineers?
  - No, they change the job. Agents handle the repetitive build work (boilerplate models, tests, documentation), while engineers move up to defining metrics and intent, reviewing the agent's pull requests, and governing what it's allowed to touch. The scarce skill becomes judgment about what is correct, not typing SQL faster.
- How accurate is AI at writing SQL?
  - A general-purpose model writing SQL from a raw schema is unreliable: GPT-4o scored about 51% on Snowflake's internal text-to-SQL evaluation. Grounding the same task in a governed semantic model pushed accuracy past 90%. The difference between a wrong query and a right one is almost entirely context, not the model's raw capability.
- What tools power agentic data engineering?
  - A typical stack pairs automated ingestion (Airbyte, Meltano), a transformation framework (dbt), a semantic layer (Cube.dev), a warehouse (BigQuery, Snowflake, Databricks), version control and CI/CD (Git), and a coding agent (Claude Code) connected to your data through MCP. The agent writes and runs the transformations; the rest is the harness that keeps its output correct and safe.
- Is it safe to let an AI agent work on production data?
  - It is safe when autonomy is earned, not granted by default. Permissions should be scoped and time-bound, every action should land in an immutable audit log, and mutating actions in production (schema changes, deletions, permission grants) should keep a human in the loop. Changes arriving as a reviewable pull-request diff mean mistakes surface at review time, not in a live dashboard.
- Do AI data agents need a semantic layer?
  - They work far better with one. A semantic layer encodes your business definitions (what 'active user', 'revenue', or 'churn' actually mean) so the agent doesn't have to guess them from raw column names. It's the single biggest lever on accuracy: the same model that guesses against a raw schema becomes reliable when it queries a governed semantic model.
- How do I get started with agentic data engineering?
  - Start small on a stack the agent already understands. With RevOS you install the CLI, run revos init, and land in a working project preloaded with sample data, dbt models, and a semantic layer. Then point a coding agent at it and ask for your first metric. From install to a production-ready model is an afternoon, and the free plan (up to 1 GB) makes trying it cost nothing but time.