# The First QA Checklist I Would Run On Any AI-Built App In 2026

> Source: <https://dev.to/marcusykim/the-first-qa-checklist-i-would-run-on-any-ai-built-app-in-2026-3p3g>
> Published: 2026-06-20 17:30:00+00:00

The most dangerous moment in an AI-built app is not when the app is obviously broken.

That part is annoying, but at least it is honest.

The dangerous moment is when the app looks done.

The buttons are there. The screens load. The AI says it fixed the bug. You click around for thirty seconds and nothing immediately catches fire.

Then your brain starts whispering the sweetest lie in software:

"We are probably good."

Probably good is not a QA strategy.

It is a tiny trap door with nice lighting.

If you are a beginner building with AI, you need a first QA pass that is simple enough to actually run. Not an enterprise test plan. Not a 400-line spreadsheet created by someone who uses the phrase "quality gate" recreationally.

Just a practical checklist that proves the app can survive normal use, bad input, empty data, account boundaries, and the bugs AI tends to hide while it is "fixing" something else.

The point is not to become a full-time QA engineer before you ship version one.

The point is to stop treating "the AI said it works" as evidence.

When I use AI coding tools, I do not trust the app because the code looks busy.

I trust it more when I can walk through a real user workflow and see the app behave correctly from beginning to end.

That distinction matters.

Beginners often test randomly. They click a few screens, refresh, maybe create one record, and then call the app done because the visible parts seem alive.

But a useful app is not a pile of screens.

A useful app is a workflow.

Someone has a goal. They enter information. The app saves it. The app shows it back. The app handles mistakes. The app protects the wrong person from seeing the wrong thing. The app does not collapse when the user does something slightly inconvenient, like leaving a field blank or using the back button.

That is what you are testing.

If you are still at the stage where you have an app idea but do not know what to ask AI before building or testing it, I made a free AI App Builder Starter Prompts pack for beginners. It helps you turn a rough app idea into a scoped first build with AI:

[https://marcusykim.gumroad.com/l/ai-app-builder-starter-prompts](https://marcusykim.gumroad.com/l/ai-app-builder-starter-prompts)

For QA, the useful question is not:

"Does the app look done?"

The useful question is:

"Can the user complete the promised workflow without me standing over their shoulder explaining the weird parts?"

That is the first pass.

Before touching the app, write one sentence:

```
A user should be able to [do one specific job] so they can [get one specific result].
```

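For example, for a notes app: "A user should be able to save a practice note so they can find it in the notes list later."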

This sentence becomes your QA target.

Without it, you will test the app like you are wandering through a furniture store.

"This button works. This page exists. This dropdown opens. This couch is somehow on sale."

Everything looks like progress, but you are not proving the thing that matters.

You are proving that separate parts exist.

QA starts when the parts are forced to work together.

The happy path is the normal path.

It is what should happen when the user does everything correctly.

For a simple app, the happy path might be: create a record, save it, see it in the list, edit it, and delete it.

Do not rush this.

AI-built apps can fail in boring places. The save button looks like it worked, but the data never persisted. The edit screen opens, but it is editing the wrong field. The delete button removes the item visually, but it comes back after refresh like a software boomerang.

Run the happy path slowly and write down every expectation before you test it.

Good QA sounds boring:

```
When I create a note called "Practice riff" and save it, I should see it in the notes list after refreshing the page.
```

That sentence is stronger than "check notes feature."

"Check notes feature" is a fog machine.

Specific expected behavior is a flashlight.

One beginner mistake is trusting the screen state too much.

Modern apps can make something look saved before it is actually saved where it needs to live.

So refresh.

Refresh after creating data.

Refresh after editing data.

Log out and log back in.

Close the tab and reopen it.

If it is a mobile app, kill and reopen the app.

You are checking whether the app has real persistence or just temporary optimism.

This matters especially with AI-generated code because the tool may wire up a beautiful interface before the data flow is truly correct.

The screen can be convincing while the backend is silently shrugging.

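I want to know whether the data is still there, and still correct, after a refresh, a fresh login, and a reopened tab or app.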

If the answer is unclear, the app is not done.

It is wearing a done costume.
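One way to make the refresh check repeatable is a tiny script that saves a record, throws away the in-memory state, and reloads from storage. This is only a sketch: `NoteStore` and its JSON file are hypothetical stand-ins, and your app's real storage layer will look different.

```python
import json
import tempfile
from pathlib import Path


class NoteStore:
    """Hypothetical store that keeps notes in a JSON file (an assumption for illustration)."""

    def __init__(self, path):
        self.path = path
        # Load whatever was persisted earlier; start empty if nothing exists yet.
        self.notes = json.loads(path.read_text()) if path.exists() else []

    def add(self, title):
        self.notes.append(title)
        # Write through to disk so the note survives a "refresh".
        self.path.write_text(json.dumps(self.notes))


def survives_refresh(path, title):
    """Create a note, discard the in-memory store, reload, and check the note is still there."""
    NoteStore(path).add(title)
    reloaded = NoteStore(path)  # simulates refreshing the page or reopening the app
    return title in reloaded.notes


with tempfile.TemporaryDirectory() as tmp:
    persisted = survives_refresh(Path(tmp) / "notes.json", "Practice riff")
```

If `add` only appended to the in-memory list and never wrote to disk, `persisted` would come back `False`. That is the "temporary optimism" case.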

Users do not lovingly fill out every field like they are completing a sacred ritual.

They skip things.

They paste weird text.

They submit forms early.

They type one letter and then get distracted by a microwave beep or the sudden realization that they forgot to respond to an email from four days ago.

Your app needs to handle this.

For every important form, test empty fields, invalid values, pasted text, and submitting before the form is complete.

You are not trying to be dramatic.

You are trying to find the places where the app assumes the user behaves like the developer.

That assumption is usually false.

AI-generated code can be especially optimistic here. It may build the form, connect the button, and skip the boring validation rules unless you explicitly ask for them.

Ask for them.

Then test them yourself.
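Here is a minimal sketch of what those boring validation rules can look like for a single field. The function name, the 100-character limit, and the error messages are all assumptions for illustration, not rules from any particular framework.

```python
MAX_TITLE_LENGTH = 100  # assumed limit for illustration


def validate_title(raw):
    """Return a list of user-facing problems; an empty list means the input is acceptable."""
    errors = []
    title = raw.strip()  # treat whitespace-only input as empty
    if not title:
        errors.append("Title is required.")
    elif len(title) > MAX_TITLE_LENGTH:
        errors.append(f"Title must be {MAX_TITLE_LENGTH} characters or fewer.")
    return errors


# Inputs a distracted real user might submit.
results = {
    "empty": validate_title(""),
    "spaces only": validate_title("   "),
    "one letter": validate_title("a"),
    "too long": validate_title("x" * 101),
}
```

The empty and spaces-only inputs should both come back with "Title is required.", the one-letter title should pass, and the 101-character title should be rejected.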

An empty state is what the app shows when there is no data yet.

Beginners forget this constantly because they build while staring at fake sample data.

The app looks great when it has five beautiful placeholder records.

Then a real new user signs in and sees a blank void with a navigation bar.

That is not a first impression.

That is a small abandoned warehouse.

For each main screen, ask what a brand-new user with no data will see there.

A good empty state does not need to be clever.

It just needs to answer:

"What now?"

If your app cannot answer that, the user has to guess.

Guessing is friction.

Friction is where beginner apps quietly lose people.
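In code, an empty state is usually just one extra branch. The sketch below uses a hypothetical `render_notes_screen` function that returns plain text; in a real app this would be a component or template, and the wording is only a placeholder.

```python
def render_notes_screen(notes):
    """Return the text a user would see on a hypothetical notes screen."""
    if not notes:
        # Empty state: say what is missing and what to do next.
        return "No notes yet. Select 'New note' to create your first one."
    return "\n".join(f"- {title}" for title in notes)


new_user_view = render_notes_screen([])
returning_user_view = render_notes_screen(["Practice riff"])
```

The branch is trivial. The hard part is remembering to test with zero records instead of the sample data you built with.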

If your app has accounts, this part is not optional.

You need at least two test accounts.

Not one.

Two.

Account A should not see Account B's private data.

Account B should not edit Account A's records.

Logged-out users should not reach private screens just by typing a URL.

This is where "it works on my machine" becomes dangerous, because your machine is probably logged in as the same test user all the time.

Create two accounts and test each of those boundaries directly, including the logged-out case.

Do not rely on vibes here.

Authentication is the bouncer at the door.

If the bouncer is asleep, the furniture arrangement does not matter.
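The three boundary checks above can be sketched as a small ownership test. Everything here (`Record`, `Forbidden`, `read_record`, and the account names) is hypothetical; a real app would enforce this on the server or in database access rules, not in a helper like this.

```python
from dataclasses import dataclass


@dataclass
class Record:
    owner: str
    body: str


class Forbidden(Exception):
    """Raised when a user tries to touch data they do not own."""


def read_record(record, current_user):
    # Logged-out users (None) and non-owners are both rejected.
    if current_user is None or current_user != record.owner:
        raise Forbidden("You do not have access to this record.")
    return record.body


def is_blocked(record, user):
    """Return True if reading the record as this user is refused."""
    try:
        read_record(record, user)
    except Forbidden:
        return True
    return False


private_note = Record(owner="account_a", body="A's private note")

owner_can_read = read_record(private_note, "account_a") == "A's private note"
other_account_blocked = is_blocked(private_note, "account_b")
logged_out_blocked = is_blocked(private_note, None)
```

All three values should be `True`. If only the first one is, the app works for its owner and for everyone else too.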

When something breaks, the beginner instinct is to paste the error into AI and say:

"Fix this."

That can work.

It can also create a chain of tiny surgical fixes that solve the visible symptom while quietly damaging the larger workflow.

I prefer to slow the tool down.

Before asking for a fix, ask for a test plan.

Use a prompt like this:

```
I am testing this workflow:
[describe the workflow]

The bug I saw is:
[describe the bug]

Before changing code, explain:
1. the likely causes
2. which files or areas might be involved
3. what behavior should be true after the fix
4. what regression checks I should run
5. the smallest safe fix you recommend

Do not write code yet. Give me the plan first.
```

This is one of the reasons I include QA and debugging prompts in the free AI App Builder Starter Prompts pack. The useful move is not asking AI to magically fix everything. The useful move is forcing it to name the expected behavior before it changes the project:

[https://marcusykim.gumroad.com/l/ai-app-builder-starter-prompts](https://marcusykim.gumroad.com/l/ai-app-builder-starter-prompts)

That step protects you from the AI tool becoming a very confident racetrack for random patches.

Planning before fixing sounds slower.

In practice, it often saves time because you stop turning one bug into three new bugs wearing different hats.

After the fix comes the regression pass.

Regression is when a new change breaks something that used to work.

AI can do this very easily because it is often focused on the local task you just gave it.

It fixes the signup bug, but now profile editing breaks.

It fixes the save button, but now the list does not refresh.

It fixes the mobile layout, but now the desktop layout looks like someone folded the page in half and sat on it.

After every meaningful fix, re-test the happy path and the behavior that already worked before the change.

This is not glamorous.

It is also where a lot of real software quality lives.

The question is not:

"Did AI fix the bug?"

The question is:

"Did the app still keep its promises after the fix?"
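A regression pass does not need a framework to get started. The sketch below keeps a named list of promises and re-runs all of them after each fix. The check names and the lambdas are placeholders I made up; in a real project each one would actually exercise the app.

```python
def run_regression(checks):
    """Run every named check and return the names of the ones that fail."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a crash counts as a broken promise
        if not ok:
            failed.append(name)
    return failed


# Hypothetical checks standing in for real ones.
checks = {
    "signup still works": lambda: True,
    "profile edit still works": lambda: False,   # pretend the last fix broke this
    "list refreshes after save": lambda: 1 / 0,  # pretend this one crashes
}

broken = run_regression(checks)
```

The important design choice is that the list only grows. Every bug you fix becomes one more check that runs next time.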

You do not need to simulate every disaster for version one.

But you should at least check what happens when loading is slow or data is missing.

Beginner apps often fail because they assume everything appears instantly.

Then the real world arrives with a weak connection, a slow request, a failed upload, or a backend rule that rejects something.

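Check what the user sees while a request is loading, when it fails, and when the data it expects is missing.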

Even one pass here can reveal a lot.

If your app shows nothing while loading, users may click again.

If your app shows a technical error, users may leave.

If your app silently fails, users may think they did something wrong.

Silence is not a user experience.

It is a mystery novel with no ending.
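One way to avoid silence is to make every outcome of a request map to something the screen can show. This is a sketch under assumptions: `load_projects`, the status names, and the messages are invented for illustration, and the `fetch` argument stands in for whatever call your app makes to its backend.

```python
def load_projects(fetch):
    """Call `fetch` and turn every outcome into something the screen can show."""
    try:
        projects = fetch()
    except TimeoutError:
        return {"status": "error", "message": "This is taking too long. Please try again."}
    except Exception:
        # Hide technical details from the user; log them elsewhere in a real app.
        return {"status": "error", "message": "Something went wrong. Please try again."}
    if not projects:
        return {"status": "empty", "message": "No projects yet."}
    return {"status": "ok", "projects": projects}


def slow_fetch():
    raise TimeoutError("simulated slow connection")


def broken_fetch():
    raise RuntimeError("simulated backend rejection")


slow_view = load_projects(slow_fetch)
broken_view = load_projects(broken_fetch)
ok_view = load_projects(lambda: ["Demo project"])
```

Neither failure shows the user a raw exception, and neither one fails quietly.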

Before you call the app done, write a done-when line.

```
This workflow is done when a new user can create an account, create one project, edit it, see it after refresh, delete it, and confirm another account cannot access it.
```

That is much stronger than:

```
Project screen finished.
```

Finished according to whom?

Finished under what conditions?

Finished until which user clicks which cursed button?

A done-when line gives you a finish line you can test.

It also gives AI a better target. If you tell the tool exactly what done means, it has less room to wander into cosmetic improvements, extra features, or random refactors that do not help the first version.
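A done-when line also translates almost word for word into checks. The sketch below runs part of the project workflow against `FakeApp`, an in-memory stand-in I made up for illustration. It does not model account creation or refresh persistence, so those two parts of the line still need their own checks against the real app.

```python
class FakeApp:
    """In-memory stand-in for the real app (an assumption for illustration)."""

    def __init__(self):
        self.projects = {}  # project name -> owner

    def create(self, user, name):
        self.projects[name] = user

    def rename(self, user, old, new):
        if self.projects.get(old) != user:
            raise PermissionError("not your project")
        self.projects[new] = self.projects.pop(old)

    def visible_to(self, user):
        return [name for name, owner in self.projects.items() if owner == user]

    def delete(self, user, name):
        if self.projects.get(name) != user:
            raise PermissionError("not your project")
        del self.projects[name]


app = FakeApp()
app.create("new_user", "Demo")
app.rename("new_user", "Demo", "Demo v2")

# Each clause of the done-when line becomes one named check.
done_when = {
    "owner sees edited project": app.visible_to("new_user") == ["Demo v2"],
    "other account sees nothing": app.visible_to("other_user") == [],
}
app.delete("new_user", "Demo v2")
done_when["project is gone after delete"] = app.visible_to("new_user") == []

workflow_done = all(done_when.values())
```

If any clause is `False`, the dictionary tells you which promise failed. "Project screen finished" cannot tell you that.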

This is one of the main lessons I keep learning in freelance work and AI-assisted development:

Clear definitions beat heroic effort.

The app does not care how hard you worked.

The user does not care how many files changed.

The workflow either works or it does not.

If I had to turn this into a simple checklist, I would run this:

```
1. Write the one user workflow.
2. Run the happy path slowly.
3. Refresh after create, edit, and delete.
4. Test empty and invalid inputs.
5. Check brand-new empty states.
6. Use two accounts and test boundaries.
7. Ask AI for a test plan before bug fixes.
8. Re-test old behavior after every fix.
9. Check one loading or error state.
10. Write the done-when line in plain English.
```

That is not a complete QA department.

It is a first pass.

But a first pass is much better than the usual beginner pattern:

Click around, feel hopeful, ship, panic.

AI can help you build faster.

It can also help you create a mess faster.

QA is how you slow the project down just enough to keep control.

The practical takeaway:

Do not ask, "Does the app look done?"

Ask, "Can the user complete the promised workflow, recover from mistakes, and trust the data afterward?"

That question will make your AI-built app better immediately.

If you want the full build-along field manual behind the free prompts, AI App Builder From Zero walks through idea, scope, stack, prompting, QA, deployment, and launch:

[https://marcusykim.gumroad.com/l/ai-app-builder-from-zero](https://marcusykim.gumroad.com/l/ai-app-builder-from-zero)
