The Role of QA in the New AI SDLC


QA’s role in the new AI SDLC is no longer just “test the finished application.”

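It is becoming quality engineering across the entire lifecycle, from requirements and specs through prompts, generated code, CI/CD gates, production monitoring, and governance.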

The big shift is this:

Old SDLC QA: Does the software meet the requirements?

New AI SDLC QA: Can we trust the system, the AI-generated work, the data, the model behavior, and the delivery process, repeatedly, safely, and measurably?

AI does not eliminate QA.

It makes strong QA leadership more important.

The AI SDLC can be drawn as a simple loop:

Business Need / Product Idea
        ↓
Requirements + Risk Definition
        ↓
Spec-Driven Development
        ↓
Prompt / Agent / Workflow Design
        ↓
AI-Assisted Code + Test Generation
        ↓
Human Review + Automated Testing
        ↓
CI/CD Quality Gates
        ↓
Deployment
        ↓
Production Monitoring
        ↓
Feedback, Drift, Incidents, Metrics
        ↺ loops back into Requirements + Risk Definition

QA is not sitting at the end of this flow.

QA influences the entire loop:

QA / Quality Engineering
        ↳ Requirements
        ↳ Specs
        ↳ Prompts and agents
        ↳ Generated code
        ↳ Automated tests
        ↳ CI/CD quality gates
        ↳ Production monitoring
        ↳ Feedback and improvement
        ↳ Governance and audit evidence

QA should be involved before code exists.

For AI-enabled systems, requirements need to include not just functional behavior, but also risk, trust, and guardrails.

QA helps define requirements that are testable, measurable, and risk-aware.

This is one of the most important changes in the AI SDLC.

QA cannot wait until the end of the process and then try to test quality into the system. The quality strategy has to start at the beginning.
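One way to make "risk, trust, and guardrails" concrete is to treat them as required fields on every requirement. The sketch below is only an illustration: the `Requirement` record, its field names, and the sample requirement text are assumptions, not a standard schema.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    """Hypothetical requirement record; field names are illustrative."""
    req_id: str
    behavior: str                                   # functional behavior expected
    risk: str = ""                                  # what could go wrong
    guardrails: list = field(default_factory=list)  # limits the system must respect
    acceptance: str = ""                            # measurable acceptance criterion


def missing_quality_fields(req: Requirement) -> list:
    """Return the names of quality fields that are still empty."""
    gaps = []
    if not req.risk.strip():
        gaps.append("risk")
    if not req.guardrails:
        gaps.append("guardrails")
    if not req.acceptance.strip():
        gaps.append("acceptance")
    return gaps


# A draft that only states functional behavior, and a reviewed version.
draft = Requirement(req_id="REQ-1", behavior="Summarize a support ticket")
reviewed = Requirement(
    req_id="REQ-2",
    behavior="Summarize a support ticket",
    risk="Summary omits or invents customer commitments",
    guardrails=["Never include payment card numbers in the summary"],
    acceptance="At least 95% of sampled summaries rated accurate by a reviewer",
)

print(missing_quality_fields(draft))     # ['risk', 'guardrails', 'acceptance']
print(missing_quality_fields(reviewed))  # []
```

A check like this can run before any code or prompt is written, which is the point: the gaps show up while they are still cheap to fix.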

In an AI SDLC, the specification becomes more important, not less.

If AI agents or copilots are generating code, tests, documentation, or workflows, then QA needs to help make the specification precise enough that AI can generate useful output.

QA should push for specs that include examples, counterexamples, edge cases, and acceptance criteria.

A useful traceability chain looks like this:

Requirement → Prompt/Spec → Generated Code → Tests → Evidence

This is where QA becomes a system designer of correctness, not just a defect finder.
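The traceability chain can be checked mechanically. The following is a minimal sketch, assuming each requirement is tracked in a small record; the `TraceLink` layout, file names, and run identifier are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Stage names mirror the chain above; the record layout itself is an assumption.
STAGES = ("spec", "code", "tests", "evidence")


@dataclass
class TraceLink:
    requirement: str
    spec: str = ""
    code: list = field(default_factory=list)
    tests: list = field(default_factory=list)
    evidence: list = field(default_factory=list)


def first_broken_stage(link: TraceLink) -> Optional[str]:
    """Return the first empty stage in the chain, or None if it is complete."""
    for stage in STAGES:
        if not getattr(link, stage):
            return stage
    return None


links = [
    TraceLink(
        "REQ-1",
        spec="specs/summary.md",
        code=["summarize.py"],
        tests=["test_summarize.py"],
        evidence=["ci-run-example"],
    ),
    # Generated code exists, but nothing tests it yet.
    TraceLink("REQ-2", spec="specs/redaction.md", code=["redact.py"]),
]

gaps = {link.requirement: first_broken_stage(link) for link in links}
print(gaps)  # {'REQ-1': None, 'REQ-2': 'tests'}
```

The useful output is not the complete chains. It is the first place each incomplete chain breaks.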

Many engineering teams are now using tools like Claude Code, GitHub Copilot, Cursor, ChatGPT, and internal AI agents to generate or modify software artifacts.

That means QA also needs to help test the prompts, skills, conventions, and workflows themselves.

QA should validate AI workflows for consistency, correctness, guardrails, and failure modes.

For AI QE Architects, this is a major opportunity.

A strong QA function can create reusable prompts, skills, conventions, documentation, and evaluation checks so teams generate better software and better tests consistently.
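Consistency is one of the easier workflow properties to measure. The sketch below runs the same prompt several times and reports how often the outputs agree. It assumes the AI tool can be wrapped in a plain `generate(prompt)` callable; the stub models here are stand-ins so the example runs without any real service.

```python
import itertools
from collections import Counter
from typing import Callable


def consistency_rate(generate: Callable[[str], str], prompt: str, runs: int = 5) -> float:
    """Fraction of runs that agree with the most common normalized output."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    outputs = [generate(prompt).strip().lower() for _ in range(runs)]
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / runs


def make_stub(responses):
    """Stand-in for a model call: cycles through canned responses."""
    cycle = itertools.cycle(responses)
    return lambda prompt: next(cycle)


stable = make_stub(["APPROVE"])
flaky = make_stub(["APPROVE", "approve ", "REJECT", "APPROVE", "REJECT"])

prompt = "Should this pull request be approved? Answer APPROVE or REJECT."
print(consistency_rate(stable, prompt))  # 1.0
print(consistency_rate(flaky, prompt))   # 0.6
```

Exact-match agreement only suits short, constrained outputs. Free-form text would need a looser comparison, which is a design decision QA should own.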

AI can generate a lot of tests quickly.

That is useful.

It is also risky if nobody checks whether those tests are meaningful.

QA’s role is to make sure AI-generated tests are meaningful, not just numerous.

The trap is believing this:

More tests automatically means better quality.

It does not.

QA needs to guard against shallow, duplicated, brittle, or misleading AI-generated tests.

The goal is not just volume. The goal is useful coverage, meaningful validation, and trustworthy release evidence.

For systems using machine learning, large language models, recommendations, classification, scoring, summarization, or prediction, QA now has to care about data and model behavior too.

That includes validating datasets, model behavior, drift, and evaluation metrics.

Traditional software tests usually ask whether the code follows deterministic rules.

AI systems often require a broader question:

Is the behavior acceptable, safe, and reliable across the kinds of real-world inputs the system will receive?

That requires evaluation strategy, monitoring, and human judgment.
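An evaluation strategy often replaces a single pass/fail assertion with a pass rate over a set of realistic inputs. Here is a minimal sketch of that idea. The `stub_classifier` stands in for a real model call, and the cases are invented for illustration.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, Callable[[str], bool]]


def pass_rate(system: Callable[[str], str], cases: List[Case]) -> float:
    """Fraction of cases whose output satisfies that case's check."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    passed = sum(1 for text, check in cases if check(system(text)))
    return passed / len(cases)


def stub_classifier(text: str) -> str:
    """Stand-in for a model call; a real system would call the model here."""
    return "refund" if "refund" in text.lower() else "other"


CASES = [
    ("I want a refund", lambda out: out == "refund"),
    ("REFUND please", lambda out: out == "refund"),
    ("Money back now", lambda out: out == "refund"),   # paraphrase the stub misses
    ("Where is my order?", lambda out: out == "other"),
]

rate = pass_rate(stub_classifier, CASES)
print(rate)  # 0.75
```

Whether 0.75 is acceptable is not something the code can answer. That threshold is the human judgment part.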

QA should help define automated gates that prevent bad AI-generated or AI-enabled changes from reaching production.

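Examples include requiring passing automated tests, human review of AI-generated code, and release evidence before production.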

The goal is not to slow everyone down.

The goal is to make fast delivery safe.

This is especially important when AI increases the speed at which teams can produce code.

Faster generation without stronger quality gates simply accelerates risk.
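A quality gate can be as simple as a list of thresholds compared against the metrics a pipeline run produced. The sketch below assumes metrics arrive as a plain dictionary; the metric names and thresholds are hypothetical, not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    metric: str
    threshold: float
    higher_is_better: bool = True


def failed_gates(metrics: dict, gates: list) -> list:
    """Return a human-readable reason for every gate that does not pass."""
    failures = []
    for gate in gates:
        value = metrics.get(gate.metric)
        if value is None:
            # A missing metric fails the gate: no evidence is not a pass.
            failures.append(f"{gate.metric}: missing")
            continue
        if gate.higher_is_better:
            ok = value >= gate.threshold
        else:
            ok = value <= gate.threshold
        if not ok:
            failures.append(f"{gate.metric}: {value} vs threshold {gate.threshold}")
    return failures


GATES = [
    Gate("eval_pass_rate", 0.9),
    Gate("tests_without_assertions", 0, higher_is_better=False),
    Gate("line_coverage", 0.8),
]

metrics = {"eval_pass_rate": 0.75, "tests_without_assertions": 0}
failures = failed_gates(metrics, GATES)
print(failures)
# ['eval_pass_rate: 0.75 vs threshold 0.9', 'line_coverage: missing']
```

In a pipeline, a non-empty `failures` list would translate into a non-zero exit code so the change cannot proceed.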

AI systems can degrade after release because the world around them changes.

Things that can change include:

QA therefore needs to stay involved after release through production monitoring and by tracking feedback, drift, incidents, and metrics.

This is one of the biggest mindset shifts:

Production becomes part of the test strategy.

In the AI SDLC, testing does not stop at deployment.

Production behavior becomes a source of quality information that feeds back into requirements, specs, tests, prompts, and governance.
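Treating production as part of the test strategy can start small. The sketch below keeps a rolling window of pass/fail outcomes from production checks and flags degradation against a pre-release baseline. The class name, baseline, tolerance, and window size are all assumptions chosen for illustration.

```python
from collections import deque


class RollingQualityMonitor:
    """Tracks recent production check outcomes against a release baseline."""

    def __init__(self, baseline: float, tolerance: float, window: int):
        self.baseline = baseline
        self.tolerance = tolerance
        self.results = deque(maxlen=window)  # oldest outcomes drop off automatically

    def record(self, passed: bool) -> None:
        self.results.append(passed)

    def degraded(self) -> bool:
        """True once the window is full and its pass rate is below baseline - tolerance."""
        if len(self.results) < self.results.maxlen:
            return False
        rate = sum(self.results) / len(self.results)
        return rate < self.baseline - self.tolerance


monitor = RollingQualityMonitor(baseline=0.9, tolerance=0.05, window=4)

for passed in [True, True, True]:
    monitor.record(passed)
early = monitor.degraded()   # window not full yet

for passed in [False, False]:
    monitor.record(passed)
later = monitor.degraded()   # window is now [True, True, False, False]

print(early, later)  # False True
```

A degraded signal should feed back into requirements, specs, and tests, not just page someone.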

AI creates a new need for evidence.

QA can own or strongly influence the evidence trail.

That means preserving traceability, audit evidence, approvals, and known limitations.

This matters in regulated environments, but it also matters for any company trying to use AI responsibly.

Governance is not just paperwork.

Good governance helps teams prove that they understood the risks, tested the right things, and made informed release decisions.
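An evidence trail is easier to trust when each record carries a fingerprint of its own contents. The sketch below builds a release evidence record and a SHA-256 digest over its canonical JSON form. The field names and sample values are hypothetical. A bare hash only reveals accidental or unrecorded edits; it is not a substitute for signing or access control.

```python
import hashlib
import json


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, no extra whitespace)."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def evidence_record(release, requirement_ids, test_summary, approver, known_limitations) -> dict:
    """Build a release evidence record with a digest of its own contents."""
    body = {
        "release": release,
        "requirement_ids": list(requirement_ids),
        "test_summary": dict(test_summary),
        "approver": approver,
        "known_limitations": list(known_limitations),
    }
    body["sha256"] = _digest(body)
    return body


def is_intact(record: dict) -> bool:
    """Recompute the digest and compare it with the stored one."""
    body = {key: value for key, value in record.items() if key != "sha256"}
    return _digest(body) == record.get("sha256")


record = evidence_record(
    release="example-release-1",
    requirement_ids=["REQ-1", "REQ-2"],
    test_summary={"passed": 120, "failed": 0},
    approver="qa-lead",
    known_limitations=["Summaries not evaluated on non-English tickets"],
)

edited = dict(record, approver="someone-else")
print(is_intact(record), is_intact(edited))  # True False
```

Known limitations belong in the record on purpose. Evidence that only lists what passed is not an informed release decision.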

In the AI SDLC, QA becomes less about manual validation at the end and more about designing a trustworthy delivery system.

Area	QA / QE Responsibility
Product idea	Identify quality risks early
Requirements	Make requirements testable, measurable, and risk-aware
Specs	Add examples, counterexamples, edge cases, and acceptance criteria
Prompts / agents	Validate consistency, correctness, guardrails, and failure modes
Generated code	Review AI-generated code for correctness, maintainability, and standards
Test automation	Generate, review, scale, and govern automated tests
Data / model quality	Validate datasets, model behavior, drift, and evaluation metrics
CI/CD	Build quality gates into pipelines
Deployment	Require release evidence before production
Production	Monitor quality after release
Governance	Preserve traceability, audit evidence, approvals, and known limitations

Old SDLC: Requirements → Code → Test → Release

Traditional QA often enters late and asks:

Does the software meet the requirements?

New AI SDLC: Risk → Spec → Prompt → Generated Code → Test → Gate → Monitor → Improve

AI SDLC QA enters early and keeps asking:

How do we know this is correct, safe, maintainable, observable, and fit for purpose?

QA is becoming the group that answers:

How do we know this AI-assisted system is correct, safe, maintainable, observable, and fit for purpose?

That is a much bigger role than traditional testing.

It is also a huge opportunity for experienced QA architects, because AI makes weak engineering processes worse and strong engineering processes faster.

QA’s job is to make sure the organization gets the second outcome, not the first.

In the new AI SDLC, QA is not just testing software.

QA is helping the organization build systems that are correct, safe, maintainable, observable, and fit for purpose.

AI does not replace QA. AI makes strong QA leadership more important.

These references are useful for grounding this model of QA in the AI SDLC.

The NIST AI Risk Management Framework (AI RMF 1.0) provides a practical framework for thinking about AI risk through four functions: Govern, Map, Measure, and Manage.

Useful for supporting the role of QA in risk definition, measurement, monitoring, governance, and lifecycle accountability.

https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf

Google Cloud’s MLOps guidance explains why machine learning systems require CI/CD, continuous training, automation, monitoring, and production feedback loops.

Useful for supporting the idea that AI quality is not a one-time testing event.

Google's Practitioners Guide to MLOps provides a broader view of operationalizing ML systems, including lifecycle practices, automation, monitoring, and production readiness.

Useful for grounding QA’s role in end-to-end ML system quality.

https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf

Microsoft’s Responsible AI Standard provides concrete requirements for building AI systems responsibly.

Useful for supporting governance, accountability, transparency, reliability, safety, fairness, privacy, and inclusive design considerations.

The OWASP Top 10 for LLM Applications identifies major security risks for LLM applications, including prompt injection, insecure output handling, training data poisoning, sensitive information disclosure, and supply-chain vulnerabilities.

Useful for supporting QA involvement in LLM-specific security and quality risks.

https://genai.owasp.org/llm-top-10/

ISO/IEC 42001 defines an AI management system standard for organizations that develop, provide, or use AI systems.

Useful for supporting auditability, governance, accountability, lifecycle management, and continuous improvement.

Source: dev.to (original article)