Stop checking AI-generated code. Start generating less of it

A 2026 survey of over 1,100 developers by Sonar found that 42% of committed code is now AI-assisted, with roughly 29% of that code merged without any manual review. The industry's reliance on post-generation validation tools like static analyzers and security scanners creates a reactive posture in which costs scale linearly with code volume. The alternative is to reduce the amount of code that requires guardrails in the first place.

According to Sonar’s State of Code Developer Survey report for 2026 (https://www.sonarsource.com/state-of-code-developer-survey-report.pdf), based on a survey of over 1,100 developers, 42% of committed code is now AI-assisted, and roughly 29% of it gets merged without manual review. Not “light review.” No review at all.

The industry’s response has been predictable: more guardrails. Static analysis. Token linting. Visual regression testing. Accessibility audits. Security scans. Each tool is a reasonable reaction to a real failure mode. Taken together, though, they describe something uncomfortable: a system permanently compensating for its own unreliability. The AI generates. The tooling checks. The developers arbitrate. And the whole apparatus scales linearly with the volume of code being produced. That is the wrong scaling curve for any enterprise that plans to build more than a handful of applications.

The conventional framing, “How do we build better guardrails for AI-generated code?”, is not wrong. In my opinion, it is just incomplete. The more productive question is, “How do we reduce the amount of code that needs guardrails in the first place?” That question leads us to a fundamentally different architecture, one that thoughtfully applies AI on an escalating curve from zero to partial to full code generation. I call it the AI assembly model.

First, let’s take a deeper look at how things work today. When a generative AI tool (https://www.infoworld.com/article/2338115/what-is-generative-ai-artificial-intelligence-that-creates.html) produces a UI component from scratch, such as a data table, a form, or a navigation bar, the output is probabilistic. It might be correct. It might also carry a missing authentication check, a hardcoded color value that bypasses the design system, broken accessibility markup, or a state management pattern that collapses under concurrent load. You will not know until you inspect it. And inspection, at enterprise scale, is expensive.
So, the industry layers on post-generation validation. A static analyzer catches potential injection vectors. A linter flags design token drift. A visual regression suite compares the rendered component against a baseline. An accessibility scanner checks ARIA (https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA) roles and contrast ratios. A DAST (https://owasp.org/www-project-devsecops-guideline/latest/02b-Dynamic-Application-Security-Testing) tool probes the running application for OWASP Top 10 (https://owasp.org/www-project-top-ten/) vulnerabilities. Each of these tools addresses a genuine risk. None of them prevents the risk from occurring. They detect it after the fact.

This is a reactive posture, and it has a structural cost problem. Every new application built on a generate-first model requires the full battery of checks to run again. Every component generated from a prompt is a fresh surface for every category of defect. Double the number of apps, and you double the audit burden. Triple them, and you triple it. There is no compounding advantage. Each generation event starts from zero.

For a team shipping one experimental chatbot, that cost is manageable. For an enterprise program building dozens of internal applications across regulated business lines, it becomes the dominant line item in the development life cycle, not in compute costs, but in developer hours spent diagnosing wrong output, QA cycles catching regressions, and production incidents when defects slip through.

The AI assembly model starts from a different premise. The most reliable code is code that was never generated on demand.
Instead of prompting a large language model (LLM, https://www.infoworld.com/article/2335213/large-language-models-the-foundations-of-generative-ai.html) to write a component from scratch every time, the assembly model maps developer intent, whether expressed through a natural-language prompt, a visual canvas interaction, or a Figma import, to a pre-built, tested, certified component from an enterprise library. The AI’s job is not to write the component. It is to select the right component and configure it. This is a meaningful architectural distinction, not a marketing one.

The assembly model operates across three tiers of generation, each with a different risk profile. The guardrail, in this model, is not a check that fires after generation. It is the routing rule that sends developer intent to a pre-built artifact instead of a generative model. If the library has the answer, generation never starts. When it does start, it is scoped precisely to the gap that triggered it.

The assembly model works only if the components in the library are genuinely certified artifacts, not just reusable snippets. Quality must be a property of the component itself, not something the consuming application is responsible for verifying. That means each component in the enterprise library must carry binding guarantees across several dimensions.

The front-end component story is compelling, but the harder problem, and the higher-stakes one, lives in back-end services. Persistence layers, API endpoints, security filters, service integrations: this is where the most code gets generated in a typical enterprise application, and where architectural mistakes are most consequential. The AI assembly model handles this by embedding architectural guardrails as structural properties of every generated service, not as optional patterns that developers must remember to follow, but as invariants that the platform enforces. The distinction matters.
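One way to picture the difference between an optional pattern and an enforced invariant is a platform-side constructor that refuses to build a service that violates a rule. The sketch below is illustrative only: `ServiceSpec`, `Endpoint`, `build_service`, and `GuardrailViolation` are hypothetical names, and the two rules shown (a declared role on every endpoint, secrets supplied from outside the code) are assumptions chosen from widely known principles, not a reproduction of any particular platform's guardrails.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional, Tuple


class GuardrailViolation(Exception):
    """Raised when a service definition breaks a platform invariant."""


@dataclass(frozen=True)
class Endpoint:
    path: str
    required_role: Optional[str]  # None means no role check was declared


@dataclass(frozen=True)
class ServiceSpec:
    name: str
    endpoints: Tuple[Endpoint, ...]
    secret_env_vars: Tuple[str, ...]  # names of variables, never secret values


def build_service(spec: ServiceSpec, env: Mapping[str, str]) -> Dict[str, object]:
    """Return a service descriptor, or raise if any invariant is violated."""
    # Invariant 1 (role-based access): every endpoint must declare a role.
    for endpoint in spec.endpoints:
        if not endpoint.required_role:
            raise GuardrailViolation(
                f"{spec.name}{endpoint.path}: no required role declared"
            )
    # Invariant 2 (externalized secrets): every secret must be supplied
    # by the environment rather than embedded in the definition.
    missing = [name for name in spec.secret_env_vars if name not in env]
    if missing:
        raise GuardrailViolation(f"{spec.name}: secrets not provided: {missing}")
    return {
        "name": spec.name,
        "routes": {e.path: e.required_role for e in spec.endpoints},
        "secrets": list(spec.secret_env_vars),  # names only; values stay in env
    }
```

In this sketch a caller would pass something like `os.environ` as `env`. The point of the design is that a definition with an unprotected endpoint cannot be built at all, so the check does not depend on anyone remembering to run it.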
A pattern that developers can forget to apply is a pattern that will be forgotten, especially under the time pressure that AI-assisted velocity creates. Six back-end guardrails, in particular, define the difference between code that merely compiles and code that can safely run a regulated business.

None of these are novel ideas in isolation. Twelve-factor apps, OWASP compliance, externalized secrets, end-to-end RBAC: these are well-understood engineering principles. What is novel is making them structural properties of a code generation architecture rather than aspirational items on a checklist. When these guardrails are architectural invariants, they do not depend on developer discipline. They do not erode under deadline pressure. They do not vary between teams.

The AI assembly model is not free of trade-offs. It carries a higher context overhead than a bare generative approach. Teaching the system your component library schema, your design token bindings, your architectural constraints: all of this consumes tokens before the first line of useful output is produced. A naive comparison of per-session token cost will favor the generate-first model. But that comparison is misleading, because it ignores where the real costs accumulate.

In a generate-first model, every component is produced in full, every time. Each generation run burns tokens on implementation code that already exists in a tested form somewhere in the organization’s component library, if only the model knew to use it. Self-correction loops are frequent, because probabilistic output regularly misses the target on the first pass. And every generated component requires the full audit cycle: security, accessibility, visual regression, functional testing.

In the assembly model, the component code already exists. The AI configures rather than constructs. A fraction of the tokens. A fraction of the self-correction loops. A fraction of the output requiring validation.
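The shape of that comparison can be made concrete with a little arithmetic. The sketch below is a toy model, not a measurement: every number in it (tokens per component, retry rate, context overhead, configuration cost, library hit rate) is an assumption picked purely for illustration, and the function names are hypothetical.

```python
def generate_first_tokens(components: int, tokens_per_component: int,
                          retry_rate: float) -> float:
    """Every component is generated in full; retries regenerate it again."""
    return components * tokens_per_component * (1 + retry_rate)


def assembly_tokens(components: int, context_overhead: int, config_tokens: int,
                    library_hit_rate: float, tokens_per_component: int,
                    retry_rate: float) -> float:
    """Context is paid once; library hits cost only configuration tokens."""
    hits = components * library_hit_rate
    misses = components - hits
    # Components the library cannot cover are still generated in full.
    generated = misses * tokens_per_component * (1 + retry_rate)
    return context_overhead + hits * config_tokens + generated


def break_even(max_components: int = 1000, **params) -> int:
    """Smallest component count at which assembly uses fewer tokens."""
    for n in range(1, max_components + 1):
        gen = generate_first_tokens(n, params["tokens_per_component"],
                                    params["retry_rate"])
        asm = assembly_tokens(n, **params)
        if asm < gen:
            return n
    return -1  # never cheaper within the range examined


# Illustrative assumptions only; substitute your own figures.
ASSUMED = dict(context_overhead=20_000, config_tokens=200,
               library_hit_rate=0.8, tokens_per_component=2_000,
               retry_rate=0.5)
```

Under these assumed figures, a session that produces a single component favors generate-first by a wide margin, while `break_even(**ASSUMED)` returns 9: from the ninth component onward, the one-time context cost has been recovered. Different assumptions move the crossover point, but not the structure of the trade-off.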
The context overhead is paid once per session. The generation savings compound across every component assembled. And they compound again with every additional application built on the same library.

The real advantage, though, is not in token economics. It is in defect cost. Fewer developer hours spent diagnosing incorrect AI output. Fewer QA cycles spent catching regressions that a generate-first model produces stochastically. Fewer production incidents when defects evade the guardrail stack entirely. A pre-built, certified component absorbs those costs once, at build time. Every application that uses it inherits the savings. That is a compounding return on quality investment, the opposite of the linear cost growth that characterizes generate-then-check.

For enterprises operating in regulated industries, such as financial services, health care, government, and insurance, the compliance implications of the assembly model deserve separate attention.

A generate-first model produces a compliance artifact that says, in essence: “We generated this code, and then we tested it, and the tests passed.” That is a valid compliance posture. It is also a fragile one. It depends on the completeness of the test suite, the rigor of the review process, and the assumption that every generation run will be subjected to the same standard of scrutiny. Given that 29% of AI-assisted code is already merging without review, that assumption is under visible strain.

The assembly model produces a different artifact: “This application was assembled from components that were certified at build time against these specific standards. Only the custom-generated portions required runtime validation.” The certified-by-construction approach reduces the compliance surface to the genuinely novel code: the business logic and integrations that no library component could satisfy. Everything else carries its compliance evidence with it, embedded in the component’s certification history.
This is not a theoretical distinction. It changes the conversation with auditors, with regulators, and with the internal risk committee. It shifts compliance from a per-release testing exercise to a structural property of the development platform. And it scales: the hundredth application built on a certified library faces the same compliance burden as the first, not a hundred times the burden.

The AI code generation debate, as currently framed, asks the wrong question. “How do we add better guardrails to AI-generated code?” is a question that accepts the premise of generating everything, then checking everything. It leads to an arms race between generation volume and validation tooling, one in which AI-assisted code already accounts for 42% of committed code and is rising, and the tooling is perpetually one defect category behind.

The AI assembly model reframes the question. Not “how do we check more effectively?” but “how do we generate less in the first place?” Not “how do we catch defects downstream?” but “how do we make defects structurally impossible for the assembled portion of the application?”

Guardrails are necessary. They will remain necessary for every line of code that AI genuinely generates. The argument here is not against guardrails. It is against a model where guardrails are the primary quality mechanism for an entire application, including the 70% or 80% of it that could have been assembled from certified parts.

The teams that figure this out first will not just ship faster. They will ship with a quality profile that generate-first teams cannot match without proportionally scaling their validation infrastructure, which is to say, without giving back most of the velocity gains that AI-assisted development was supposed to deliver.

New Tech Forum provides a venue for technology leaders, including vendors and other outside contributors, to explore and discuss emerging enterprise technology in unprecedented depth and breadth.
The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.