"AI doesn't replace QA engineers. It gives them superpowers."
Seven parts. One framework. Built from scratch.
POM-based UI tests. Network interception. Multi-user contexts. A full API testing layer with typed clients. Visual regression across four viewports. A complete debugging toolkit. A production CI/CD pipeline with sharding, Docker, and Slack notifications.
This is what we built.
Now we add AI on top of it.
Not as a replacement — as an amplifier.
The engineers who will define the next decade of QA are the ones who understand how to use AI tools deliberately — knowing exactly what they're good for, exactly where they fall short, and how to combine them with the solid engineering we've spent seven parts building.
That's what Part 8 is about.
Playwright MCP. AI test agents. Self-healing selectors.
Let's finish this. 🎯
After Part 7, our framework is complete and production-ready:
playwright-playbook/
├── .github/workflows/
│ ├── playwright.yml ✅ Part 7
│ └── playwright-visual.yml ✅ Part 7
├── tests/
│ ├── auth/login.spec.ts ✅ Part 1
│ ├── tasks/task-management.spec.ts ✅ Part 1
│ ├── network/ ✅ Part 2
│ ├── multi-user/ ✅ Part 3
│ ├── multi-tab/ ✅ Part 3
│ ├── api/ ✅ Part 4
│ ├── visual/ ✅ Part 5
│ └── debug/ ✅ Part 6
├── pages/
│ ├── LoginPage.ts ✅ Part 1
│ ├── TaskPage.ts ✅ Part 1
│ └── DashboardPage.ts ✅ Part 3
├── api/
│ ├── TaskApiClient.ts ✅ Part 4
│ └── AuthApiClient.ts ✅ Part 4
├── fixtures/
│ ├── auth.fixture.ts ✅ Part 1
│ ├── tasks.json ✅ Part 2
│ ├── empty-tasks.json ✅ Part 2
│ ├── tasks-har.har ✅ Part 2
│ ├── multi-user.fixture.ts ✅ Part 3
│ └── api.fixture.ts ✅ Part 4
├── scripts/
│ ├── record-har.ts ✅ Part 2
│ └── notify-slack.ts ✅ Part 7
├── utils/
│ ├── schema-validator.ts ✅ Part 4
│ ├── visual-helpers.ts ✅ Part 5
│ └── debug-helpers.ts ✅ Part 6
├── docker/ ✅ Part 7
├── snapshots/ ✅ Part 5
├── .vscode/ ✅ Part 6
├── global-setup.ts ✅ Part 1
├── playwright.config.ts ✅ Parts 1–7
├── .gitignore ✅ Part 7
└── .env
By the end of Part 8, we add:
playwright-playbook/
├── tests/
│ └── ai/ ← NEW
│ └── ai-generated.spec.ts
├── ai/ ← NEW
│ ├── TestPlanner.ts
│ └── TestHealer.ts
├── scripts/
│ ├── generate-test-plan.ts ← NEW
│ └── heal-selectors.ts ← NEW
└── .mcp.json ← NEW
Every file gets fully built below. 👇
Before we write code, be clear on what we're actually dealing with. There are three separate AI capabilities in the Playwright ecosystem, and they do very different things:
Tool                    What it does                          Status
──────────────────────  ────────────────────────────────────  ──────────────────
Playwright MCP          Exposes Playwright as a tool for      Available now
                        AI assistants (Claude, Copilot)       (v1.47+)
                        to control a real browser
AI Test Agents          Playwright's built-in agents that     Experimental
(Planner/Generator)     explore your app and generate         (v1.47+ with flag)
                        test skeletons automatically
Self-Healing            Automatically patches broken          Custom (we build
                        locators when selectors change        this in Part 8)
We'll build proper, practical implementations of all three — and be honest about where each one earns its place. 🎯
MCP stands for Model Context Protocol — an open standard for connecting AI assistants to external tools. Playwright's MCP server lets an AI assistant (Claude, GitHub Copilot, Cursor) directly control a real Chromium browser.
This means you can say to Claude or Copilot: "Navigate to the task creation flow and write me a Playwright test for it." And it will literally open a browser, click around, inspect the DOM, and generate real test code.
Install the MCP server:
npm install -D @playwright/mcp
Configure it for your project in .mcp.json:
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"@playwright/mcp",
"--browser", "chromium",
"--base-url", "http://localhost:3000",
"--output-dir", "tests/ai"
],
"env": {
"PLAYWRIGHT_STORAGE_STATE": ".auth/admin.json"
}
}
}
}
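A malformed config can be hard to diagnose, because the assistant launches the server rather than you. As a hedged sketch, the check below parses the file's text and reports structural problems. The function name `findMcpConfigProblems` and the rules it enforces are assumptions for illustration; they are not part of @playwright/mcp or the MCP specification.

```typescript
// Hypothetical sanity check for the text of an MCP config file.
// It only verifies shape: valid JSON, an "mcpServers" object, and per-server
// "command" / "args" fields. It does not validate individual CLI flags.
function findMcpConfigProblems(rawJson: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return ['not valid JSON (strict JSON does not allow comments or trailing commas)'];
  }

  const servers = (parsed as { mcpServers?: unknown } | null)?.mcpServers;
  if (typeof servers !== 'object' || servers === null) {
    return ['missing top-level "mcpServers" object'];
  }

  const problems: string[] = [];
  for (const [serverName, entry] of Object.entries(servers as Record<string, unknown>)) {
    const candidate = entry as { command?: unknown; args?: unknown } | null;
    const command = candidate?.command;
    const args = candidate?.args;
    if (typeof command !== 'string' || command.length === 0) {
      problems.push(`${serverName}: "command" must be a non-empty string`);
    }
    if (args !== undefined && !Array.isArray(args)) {
      problems.push(`${serverName}: "args" must be an array`);
    }
  }
  return problems;
}

const sampleConfig = JSON.stringify({
  mcpServers: {
    playwright: { command: 'npx', args: ['@playwright/mcp', '--browser', 'chromium'] },
  },
});
console.log(findMcpConfigProblems(sampleConfig)); // []
```

In a real project you would pass it the contents of .mcp.json read with fs.readFileSync and fail fast if the returned list is non-empty.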
For VS Code with GitHub Copilot, add to .vscode/settings.json:
{
"mcp": {
"servers": {
"playwright": {
"command": "npx",
"args": ["@playwright/mcp", "--browser", "chromium"],
"env": {
"BASE_URL": "http://localhost:3000"
}
}
}
}
}
With MCP configured, your AI assistant can:
You say:
"Write a test for the task deletion flow including the confirmation dialog"
AI does:
1. Opens Chromium via MCP
2. Navigates to http://localhost:3000/tasks
3. Inspects the DOM — finds task items, hover targets, delete buttons
4. Identifies the confirmation dialog structure
5. Generates a TypeScript test using getByRole/getByTestId locators
6. Writes it to tests/ai/ directory
You get:
A real, runnable test file — not a hallucinated one
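The key difference from asking an AI to write test code from a description: MCP sees the actual DOM. Its locators are grounded in elements it observed rather than ones it imagined, which cuts down on invented selectors but, as the generated tests later in this part show, does not eliminate them. 🔥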
Once MCP is configured in Claude Desktop or Claude.ai:
Prompt: "I have a Playwright project at http://localhost:3000.
Navigate to the task management page, explore the UI,
and write a complete TypeScript Playwright test for creating,
editing, and deleting a task. Use our TaskPage POM from pages/TaskPage.ts."
Claude will:
→ Open browser via MCP
→ Navigate and inspect the real UI
→ Generate tests that match actual element structure
→ Reference your existing POM correctly
The Test Planner crawls your app's DOM structure, scans your existing test suite for test names, and compares them against a catalogue of expected scenarios to report which ones are still missing.
// ai/TestPlanner.ts
import * as fs from 'fs';
import * as path from 'path';
import { chromium } from '@playwright/test';
export interface TestScenario {
feature: string;
scenario: string;
priority: 'high' | 'medium' | 'low';
covered: boolean;
suggestedTestFile: string;
}
export interface TestPlan {
url: string;
generatedAt: string;
totalScenarios: number;
coveredCount: number;
uncoveredCount: number;
scenarios: TestScenario[];
}
export class TestPlanner {
private readonly baseUrl: string;
private readonly testsDir: string;
constructor(baseUrl: string, testsDir: string = './tests') {
this.baseUrl = baseUrl;
this.testsDir = testsDir;
}
/**
* Crawl the app and extract all interactive elements.
* Returns a structured map of page → interactive elements.
*/
async crawlApp(
storageState?: string
): Promise<Map<string, string[]>> {
const browser = await chromium.launch();
const context = storageState
? await browser.newContext({ storageState })
: await browser.newContext();
const page = await context.newPage();
const pageMap = new Map<string, string[]>();
const pagesToCrawl = [
'/login',
'/dashboard',
'/tasks',
'/admin',
];
for (const route of pagesToCrawl) {
try {
await page.goto(`${this.baseUrl}${route}`, {
waitUntil: 'networkidle',
timeout: 10000,
});
// Extract all interactive elements and their accessible names
const elements = await page.evaluate(() => {
const interactable: string[] = [];
// Buttons
document.querySelectorAll('button').forEach(btn => {
const name = btn.getAttribute('aria-label') ||
btn.textContent?.trim() ||
btn.getAttribute('data-testid') || '';
if (name) interactable.push(`button: ${name}`);
});
// Links
document.querySelectorAll('a[href]').forEach(link => {
const name = link.textContent?.trim() ||
link.getAttribute('aria-label') || '';
if (name) interactable.push(`link: ${name}`);
});
// Forms
document.querySelectorAll('form').forEach((form, i) => {
const inputs = form.querySelectorAll('input, textarea, select');
interactable.push(`form[${i}]: ${inputs.length} fields`);
});
// Modals/Dialogs
document.querySelectorAll('[role="dialog"]').forEach(dialog => {
const label = dialog.getAttribute('aria-label') ||
dialog.getAttribute('aria-labelledby') || 'dialog';
interactable.push(`dialog: ${label}`);
});
return interactable;
});
pageMap.set(route, elements);
} catch {
pageMap.set(route, ['[page not accessible]']);
}
}
await browser.close();
return pageMap;
}
/**
* Scan existing test files and extract test names.
* Used to check which scenarios are already covered.
*/
getExistingTestNames(): string[] {
const testNames: string[] = [];
const testGlob = this.testsDir;
const scanDir = (dir: string) => {
if (!fs.existsSync(dir)) return;
const entries = fs.readdirSync(dir, { withFileTypes: true });
for (const entry of entries) {
const fullPath = path.join(dir, entry.name);
if (entry.isDirectory()) {
scanDir(fullPath);
} else if (entry.name.endsWith('.spec.ts')) {
const content = fs.readFileSync(fullPath, 'utf-8');
const matches = content.matchAll(/test\(['"`](.+?)['"`]/g);
for (const match of matches) {
testNames.push(match[1]);
}
}
}
};
scanDir(testGlob);
return testNames;
}
/**
* Generate a test plan — what's covered, what's missing, priorities.
*/
async generatePlan(): Promise<TestPlan> {
const pageMap = await this.crawlApp('.auth/admin.json');
const existingTests = this.getExistingTestNames();
const scenarios: TestScenario[] = [
// Auth scenarios
{ feature: 'Authentication', scenario: 'successful login redirects to dashboard', priority: 'high', suggestedTestFile: 'tests/auth/login.spec.ts', covered: false },
{ feature: 'Authentication', scenario: 'invalid credentials show error message', priority: 'high', suggestedTestFile: 'tests/auth/login.spec.ts', covered: false },
{ feature: 'Authentication', scenario: 'empty fields show validation errors', priority: 'high', suggestedTestFile: 'tests/auth/login.spec.ts', covered: false },
{ feature: 'Authentication', scenario: 'logout clears session', priority: 'high', suggestedTestFile: 'tests/auth/login.spec.ts', covered: false },
// Task CRUD
{ feature: 'Task Management', scenario: 'user can create a new task', priority: 'high', suggestedTestFile: 'tests/tasks/task-management.spec.ts', covered: false },
{ feature: 'Task Management', scenario: 'user can edit an existing task', priority: 'high', suggestedTestFile: 'tests/tasks/task-management.spec.ts', covered: false },
{ feature: 'Task Management', scenario: 'user can delete a task', priority: 'high', suggestedTestFile: 'tests/tasks/task-management.spec.ts', covered: false },
{ feature: 'Task Management', scenario: 'task list shows correct count after creation', priority: 'medium', suggestedTestFile: 'tests/tasks/task-management.spec.ts', covered: false },
{ feature: 'Task Management', scenario: 'task creation with empty title is blocked', priority: 'medium', suggestedTestFile: 'tests/tasks/task-management.spec.ts', covered: false },
{ feature: 'Task Management', scenario: 'task status can be changed to completed', priority: 'medium', suggestedTestFile: 'tests/tasks/task-management.spec.ts', covered: false },
// Permissions
{ feature: 'Role Permissions', scenario: 'admin sees admin panel — regular user does not', priority: 'high', suggestedTestFile: 'tests/multi-user/role-permissions.spec.ts', covered: false },
{ feature: 'Role Permissions', scenario: 'admin can delete any task — user cannot', priority: 'high', suggestedTestFile: 'tests/multi-user/role-permissions.spec.ts', covered: false },
{ feature: 'Role Permissions', scenario: 'user accessing /admin is redirected', priority: 'high', suggestedTestFile: 'tests/multi-user/role-permissions.spec.ts', covered: false },
// Real-time
{ feature: 'Real-Time', scenario: 'task assigned by admin appears instantly for user', priority: 'medium', suggestedTestFile: 'tests/multi-user/realtime-collaboration.spec.ts', covered: false },
{ feature: 'Real-Time', scenario: 'notification appears when task assigned', priority: 'medium', suggestedTestFile: 'tests/multi-user/realtime-collaboration.spec.ts', covered: false },
// API
{ feature: 'API', scenario: 'GET /api/tasks returns 200 with valid schema', priority: 'high', suggestedTestFile: 'tests/api/tasks-api.spec.ts', covered: false },
{ feature: 'API', scenario: 'POST /api/tasks without auth returns 401', priority: 'high', suggestedTestFile: 'tests/api/tasks-api.spec.ts', covered: false },
{ feature: 'API', scenario: 'DELETE /api/tasks/:id returns 404 for non-existent', priority: 'medium', suggestedTestFile: 'tests/api/tasks-api.spec.ts', covered: false },
// Error handling
{ feature: 'Error Handling', scenario: 'shows error banner when API returns 500', priority: 'high', suggestedTestFile: 'tests/network/error-simulation.spec.ts', covered: false },
{ feature: 'Error Handling', scenario: 'shows retry button when network fails', priority: 'medium', suggestedTestFile: 'tests/network/error-simulation.spec.ts', covered: false },
{ feature: 'Error Handling', scenario: 'shows skeleton when API is slow', priority: 'low', suggestedTestFile: 'tests/network/error-simulation.spec.ts', covered: false },
];
// Match existing tests against scenarios
for (const scenario of scenarios) {
scenario.covered = existingTests.some(name =>
name.toLowerCase().includes(scenario.scenario.toLowerCase().slice(0, 30))
);
}
const coveredCount = scenarios.filter(s => s.covered).length;
return {
url: this.baseUrl,
generatedAt: new Date().toISOString(),
totalScenarios: scenarios.length,
coveredCount,
uncoveredCount: scenarios.length - coveredCount,
scenarios,
};
}
/**
* Write the test plan to a Markdown file.
*/
async writePlanMarkdown(outputPath: string): Promise<void> {
const plan = await this.generatePlan();
const coveragePercent = Math.round(
(plan.coveredCount / plan.totalScenarios) * 100
);
const lines: string[] = [
`# Test Plan — ${plan.url}`,
``,
`Generated: ${new Date(plan.generatedAt).toLocaleString()}`,
``,
`## Coverage Summary`,
``,
`| Total Scenarios | Covered | Uncovered | Coverage |`,
`|-----------------|---------|-----------|----------|`,
`| ${plan.totalScenarios} | ✅ ${plan.coveredCount} | ❌ ${plan.uncoveredCount} | ${coveragePercent}% |`,
``,
`## Scenarios by Feature`,
``,
];
const features = [...new Set(plan.scenarios.map(s => s.feature))];
for (const feature of features) {
lines.push(`### ${feature}`);
lines.push('');
const featureScenarios = plan.scenarios.filter(s => s.feature === feature);
for (const s of featureScenarios) {
const status = s.covered ? '✅' : '❌';
const priority = s.priority === 'high' ? '🔴' : s.priority === 'medium' ? '🟡' : '🟢';
lines.push(`- ${status} ${priority} ${s.scenario}`);
if (!s.covered) {
lines.push(` - Add to: \`${s.suggestedTestFile}\``);
}
}
lines.push('');
}
fs.writeFileSync(outputPath, lines.join('\n'), 'utf-8');
console.log(`✅ Test plan written to ${outputPath}`);
console.log(` Coverage: ${coveragePercent}% (${plan.coveredCount}/${plan.totalScenarios})`);
}
}
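The coverage check above rests on two small heuristics: a regex that pulls test titles out of spec files, and a substring match on the first 30 characters of each scenario. A minimal sketch that isolates both makes their limits easier to see. The function names `extractTestTitles` and `isScenarioCovered` are hypothetical, and the sample spec text is invented for illustration.

```typescript
// Pull test titles out of spec source using the same pattern as getExistingTestNames.
// Limitations: test.only(...) / test.skip(...) are not matched, and a title that
// contains a quote character is cut off at that character.
function extractTestTitles(source: string): string[] {
  const titles: string[] = [];
  const pattern = /test\(['"`](.+?)['"`]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    titles.push(match[1]);
  }
  return titles;
}

// Same rule as generatePlan: a scenario counts as covered when any test title
// contains the first 30 characters of the scenario text, ignoring case.
function isScenarioCovered(scenario: string, testTitles: string[]): boolean {
  const needle = scenario.toLowerCase().slice(0, 30);
  return testTitles.some(title => title.toLowerCase().includes(needle));
}

const specSource = `
test('successful login redirects to dashboard', async ({ page }) => {});
test('user can log out', async ({ page }) => {});
`;
const titles = extractTestTitles(specSource);

console.log(isScenarioCovered('successful login redirects to dashboard', titles)); // true
console.log(isScenarioCovered('logout clears session', titles)); // false
```

The second result is the important one: the suite does test logout, but the wording differs, so the scenario is reported as uncovered. Treat the plan as a prompt for review, and keep test titles close to the scenario wording if you want the match to be reliable.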
The most practical AI feature in a real test suite. When a locator breaks because a developer changed a data-testid or renamed a button, the healer looks for the best alternative locator instead of letting the test simply fail, and can patch the source file once you have reviewed the suggestion.
// ai/TestHealer.ts
import * as fs from 'fs';
import * as path from 'path';
import { Page } from '@playwright/test';
export interface HealResult {
originalSelector: string;
healedSelector: string;
confidence: 'high' | 'medium' | 'low';
strategy: string;
}
export interface HealReport {
file: string;
healed: HealResult[];
failed: string[];
}
export class TestHealer {
/**
* Attempt to find an element using fallback strategies when the primary locator fails.
*
* Healing strategy priority (as implemented below):
* 1. getByTestId variants derived from the expected text (most stable)
* 2. getByRole + name (semantic, stable)
* 3. getByText, exact match first, then partial match
* 4. :near() proximity to known text (last resort)
*/
static async healLocator(
page: Page,
originalSelector: string,
context?: {
expectedText?: string;
expectedRole?: string;
nearText?: string;
}
): Promise<HealResult | null> {
// Strategy 1: Try getByTestId variations
if (context?.expectedText) {
const testIdVariants = [
context.expectedText.toLowerCase().replace(/\s+/g, '-'),
context.expectedText.toLowerCase().replace(/\s+/g, '_'),
context.expectedText.toLowerCase().replace(/\s+/g, ''),
];
for (const testId of testIdVariants) {
const locator = page.getByTestId(testId);
if (await locator.count() > 0) {
return {
originalSelector,
healedSelector: `page.getByTestId('${testId}')`,
confidence: 'high',
strategy: 'data-testid variant',
};
}
}
}
// Strategy 2: getByRole + name
if (context?.expectedRole && context?.expectedText) {
const roleLocator = page.getByRole(
context.expectedRole as 'button' | 'link' | 'heading' | 'listitem',
{ name: context.expectedText }
);
if (await roleLocator.count() === 1) {
return {
originalSelector,
healedSelector: `page.getByRole('${context.expectedRole}', { name: '${context.expectedText}' })`,
confidence: 'high',
strategy: 'role + name',
};
}
}
// Strategy 3: getByText (exact)
if (context?.expectedText) {
const textLocator = page.getByText(context.expectedText, { exact: true });
if (await textLocator.count() === 1) {
return {
originalSelector,
healedSelector: `page.getByText('${context.expectedText}', { exact: true })`,
confidence: 'medium',
strategy: 'exact text match',
};
}
// Partial text match
const partialLocator = page.getByText(context.expectedText);
if (await partialLocator.count() === 1) {
return {
originalSelector,
healedSelector: `page.getByText('${context.expectedText}')`,
confidence: 'low',
strategy: 'partial text match',
};
}
}
// Strategy 4: Try to find near a known element
if (context?.nearText) {
const nearLocator = page.locator(`:near(:text("${context.nearText}"))`);
if (await nearLocator.count() > 0) {
return {
originalSelector,
healedSelector: `page.locator(':near(:text("${context.nearText}"))').first()`,
confidence: 'low',
strategy: 'proximity to known text',
};
}
}
return null;
}
/**
* Scan the JSON test results for locator failures after a test run.
* Returns the deduplicated locator error strings that were found.
*/
static scanForBrokenSelectors(testResultsPath: string): string[] {
const brokenSelectors: string[] = [];
if (!fs.existsSync(testResultsPath)) {
return brokenSelectors;
}
const results = JSON.parse(fs.readFileSync(testResultsPath, 'utf-8'));
// Extract locator errors from test results, descending into nested describe suites
const visit = (suites: any[]): void => {
for (const suite of suites) {
for (const spec of suite.specs ?? []) {
if (spec.ok) continue;
for (const test of spec.tests ?? []) {
for (const result of test.results ?? []) {
const error = result.error?.message ?? '';
// Match common locator failure messages
const locatorMatch = error.match(
/waiting for (locator|getByTestId|getByRole|getByText)\(['"`](.+?)['"`]\)/
);
if (locatorMatch) {
brokenSelectors.push(locatorMatch[0]);
}
}
}
}
// describe() blocks appear as child suites in the JSON report
visit(suite.suites ?? []);
}
};
visit(results.suites ?? []);
return [...new Set(brokenSelectors)]; // deduplicate
}
/**
* Patch a test file — replace an old selector with a healed one.
*/
static patchTestFile(
filePath: string,
original: string,
healed: string
): boolean {
if (!fs.existsSync(filePath)) return false;
const content = fs.readFileSync(filePath, 'utf-8');
if (!content.includes(original)) return false;
// Use a replacer function so `$` sequences in the healed selector are inserted literally
const patched = content.replace(new RegExp(escapeRegex(original), 'g'), () => healed);
fs.writeFileSync(filePath, patched, 'utf-8');
console.log(` ✅ Patched: ${path.basename(filePath)}`);
console.log(` Before: ${original}`);
console.log(` After: ${healed}`);
return true;
}
}
function escapeRegex(str: string): string {
return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}
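Because a patch rewrites source files, it can help to preview the change before anything touches disk. The sketch below is one way to do that; `previewSelectorPatch` is a hypothetical helper, not part of TestHealer, and it works purely on strings.

```typescript
interface PatchPreview {
  occurrences: number; // how many times the original selector appears
  patched: string;     // the content as it would look after patching
}

// Replace every literal occurrence of `original` with `healed` without writing files.
// split/join treats both strings literally, so regex metacharacters in the selector
// and `$` sequences in the replacement cannot change the result.
function previewSelectorPatch(content: string, original: string, healed: string): PatchPreview {
  if (original.length === 0) {
    return { occurrences: 0, patched: content };
  }
  const parts = content.split(original);
  return { occurrences: parts.length - 1, patched: parts.join(healed) };
}

const before = "await page.getByTestId('save-btn').click();";
const preview = previewSelectorPatch(
  before,
  "page.getByTestId('save-btn')",
  "page.getByRole('button', { name: 'Save' })"
);

console.log(preview.occurrences); // 1
console.log(preview.patched); // await page.getByRole('button', { name: 'Save' }).click();
```

An occurrence count above what you expected is a useful warning sign: the same selector string may be used for different elements in different tests.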
// scripts/generate-test-plan.ts
import { TestPlanner } from '../ai/TestPlanner';
import * as path from 'path';
import * as dotenv from 'dotenv';
dotenv.config();
async function main() {
const baseUrl = process.env.BASE_URL || 'http://localhost:3000';
const outputPath = path.join(process.cwd(), 'TEST_PLAN.md');
console.log(`🤖 Generating test plan for ${baseUrl}...`);
console.log(' Crawling app and scanning existing tests...\n');
const planner = new TestPlanner(baseUrl, './tests');
await planner.writePlanMarkdown(outputPath);
console.log('\n📋 Test plan saved to TEST_PLAN.md');
console.log(' Review uncovered scenarios and add missing tests.\n');
}
main().catch(err => {
console.error('Test plan generation failed:', err);
process.exit(1);
});
Run it anytime to get an up-to-date picture of coverage:
npx ts-node scripts/generate-test-plan.ts
Output:
Generated: 18/06/2026, 09:15:00
## Coverage Summary
| Total Scenarios | Covered | Uncovered | Coverage |
|-----------------|---------|-----------|----------|
| 21 | ✅ 17 | ❌ 4 | 81% |
## Scenarios by Feature
### Authentication
- ✅ 🔴 successful login redirects to dashboard
- ✅ 🔴 invalid credentials show error message
- ✅ 🔴 empty fields show validation errors
- ❌ 🔴 logout clears session
- Add to: `tests/auth/login.spec.ts`
...
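The summary row is plain arithmetic: 17 covered out of 21 rounds to 81%. If a per-feature view would help, the same calculation can be grouped by feature. This is only a sketch; `coverageByFeature` and its types are hypothetical names and are not part of TestPlanner.ts.

```typescript
interface ScenarioStatus {
  feature: string;
  covered: boolean;
}

interface FeatureCoverage {
  feature: string;
  covered: number;
  total: number;
  percent: number;
}

// Group scenarios by feature and compute a rounded coverage percentage for each,
// preserving the order in which features first appear.
function coverageByFeature(scenarios: ScenarioStatus[]): FeatureCoverage[] {
  const byFeature = new Map<string, { covered: number; total: number }>();
  for (const scenario of scenarios) {
    const entry = byFeature.get(scenario.feature) ?? { covered: 0, total: 0 };
    entry.total += 1;
    if (scenario.covered) entry.covered += 1;
    byFeature.set(scenario.feature, entry);
  }
  return Array.from(byFeature, ([feature, { covered, total }]) => ({
    feature,
    covered,
    total,
    percent: Math.round((covered / total) * 100),
  }));
}

// Overall figure from the sample output: 17 of 21 scenarios covered.
console.log(Math.round((17 / 21) * 100)); // 81

// Authentication in the sample output: 3 of 4 scenarios covered.
const authOnly: ScenarioStatus[] = [
  { feature: 'Authentication', covered: true },
  { feature: 'Authentication', covered: true },
  { feature: 'Authentication', covered: true },
  { feature: 'Authentication', covered: false },
];
console.log(coverageByFeature(authOnly)[0].percent); // 75
```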
// scripts/heal-selectors.ts
import { chromium } from '@playwright/test';
import { TestHealer } from '../ai/TestHealer';
import * as path from 'path';
import * as dotenv from 'dotenv';
dotenv.config();
/**
* Run this after a test failure to attempt automatic selector healing.
*
* Usage:
* npx ts-node scripts/heal-selectors.ts
*
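* What it does:
* 1. Reads test-results/results.json for broken selectors
* (assumes the JSON reporter is configured to write to that path)
* 2. Launches a browser and tries alternative locator strategies
* 3. Prints the suggested replacement selector for each one
* 4. Prints a healing summary; patching is left to TestHealer.patchTestFile after review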
*/
async function main() {
const resultsPath = path.join(process.cwd(), 'test-results', 'results.json');
const baseUrl = process.env.BASE_URL || 'http://localhost:3000';
console.log('🔧 Playwright Self-Healing Selector Tool\n');
// Step 1 — Find broken selectors from test results
const broken = TestHealer.scanForBrokenSelectors(resultsPath);
if (broken.length === 0) {
console.log('✅ No broken selectors found in test results.');
return;
}
console.log(`Found ${broken.length} broken selector(s):\n`);
broken.forEach(s => console.log(` ❌ ${s}`));
console.log('');
// Step 2 — Launch browser and try to heal each one
const browser = await chromium.launch({ headless: true });
const context = await browser.newContext({
storageState: '.auth/admin.json',
});
const page = await context.newPage();
const report: { healed: number; failed: number } = { healed: 0, failed: 0 };
for (const selector of broken) {
console.log(`\n🔍 Attempting to heal: ${selector}`);
await page.goto(`${baseUrl}/tasks`, { waitUntil: 'networkidle' });
// Extract context clues from the selector string
const textMatch = selector.match(/['"`]([^'"`]+)['"`]/);
const expectedText = textMatch?.[1];
const result = await TestHealer.healLocator(page, selector, {
expectedText,
expectedRole: selector.includes('button') ? 'button' : undefined,
});
if (result) {
console.log(` ✅ Healed with strategy: ${result.strategy}`);
console.log(` Confidence: ${result.confidence}`);
console.log(` New selector: ${result.healedSelector}`);
report.healed++;
} else {
console.log(` ❌ Could not automatically heal this selector.`);
console.log(` Manual fix required.`);
report.failed++;
}
}
await browser.close();
// Step 3 — Summary
console.log('\n─────────────────────────────────────');
console.log('🔧 Healing Summary');
console.log(` ✅ Healed: ${report.healed}`);
console.log(` ❌ Failed: ${report.failed}`);
console.log('─────────────────────────────────────');
if (report.healed > 0) {
console.log('\n⚠️ Review all healed selectors before committing.');
console.log(' Run npx playwright test to verify the fixes.');
}
}
main().catch(err => {
console.error('Healer failed:', err);
process.exit(1);
});
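The "extract context clues" step in the script takes the first quoted string as the expected text, which works for getByTestId and getByText but picks up the role rather than the name for getByRole failures. A slightly more careful parser might look like the sketch below. `parseBrokenSelector` and `SelectorClue` are hypothetical names, and the error snippets are assumed to look like the ones scanForBrokenSelectors collects.

```typescript
interface SelectorClue {
  method: string;        // 'locator' | 'getByTestId' | 'getByRole' | 'getByText'
  expectedText?: string; // text or id to search for when healing
  expectedRole?: string; // ARIA role, when one can be inferred
}

// Turn a snippet such as "waiting for getByTestId('save-btn')" into healing hints.
function parseBrokenSelector(snippet: string): SelectorClue | null {
  const call = snippet.match(
    /(locator|getByTestId|getByRole|getByText)\(\s*['"`]([^'"`]+)['"`]/
  );
  if (!call) return null;

  const method = call[1];
  const firstArg = call[2];

  if (method === 'getByRole') {
    // For getByRole the first argument is the role; the text lives in { name: '...' }.
    const name = snippet.match(/name:\s*['"`]([^'"`]+)['"`]/);
    return { method, expectedRole: firstArg, expectedText: name ? name[1] : undefined };
  }

  return {
    method,
    expectedText: firstArg,
    // Same keyword heuristic as the script: guess 'button' if the word appears.
    expectedRole: snippet.indexOf('button') !== -1 ? 'button' : undefined,
  };
}

console.log(parseBrokenSelector("waiting for getByRole('button', { name: 'Save task' })"));
// { method: 'getByRole', expectedRole: 'button', expectedText: 'Save task' }
```

The returned object lines up with the context argument of TestHealer.healLocator, so it could replace the inline textMatch logic if you find the heuristic too coarse.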
Here's what AI-generated tests actually look like when you use MCP properly — and where the gaps are.
// tests/ai/ai-generated.spec.ts
// These tests were generated with Playwright MCP + Claude
// Reviewed and validated by a human before being committed
// Note: AI generated the structure — human verified selectors and assertions
import { test, expect } from '@playwright/test';
import { TaskPage } from '../../pages/TaskPage';
import { DashboardPage } from '../../pages/DashboardPage';
// ✅ Good AI generation — simple, well-structured, uses existing POM
test('task title is displayed in the task list after creation', async ({ page }) => {
const taskPage = new TaskPage(page);
await taskPage.goto();
await taskPage.createTask('AI generated test task');
await expect(taskPage.getTaskLocator('AI generated test task')).toBeVisible();
await expect(page.getByRole('listitem').filter({ hasText: 'AI generated test task' }))
.toBeVisible();
await taskPage.deleteTask('AI generated test task');
});
// ✅ Good — AI correctly identified the dashboard navigation flow
test('clicking task from dashboard navigates to task detail', async ({ page }) => {
const dashboard = new DashboardPage(page);
await dashboard.goto();
// AI correctly observed this interaction via MCP
const firstTask = page.getByRole('listitem').first();
const taskTitle = await firstTask.getByTestId('task-title').textContent();
await firstTask.getByRole('link', { name: 'View details' }).click();
await expect(page).toHaveURL(/\/tasks\/\d+/);
await expect(page.getByRole('heading', { level: 1 })).toHaveText(taskTitle ?? '');
});
// ✅ Good — AI found the filter interaction correctly
test('filtering tasks by status shows only matching tasks', async ({ page }) => {
const taskPage = new TaskPage(page);
await taskPage.goto();
// AI observed the filter dropdown via MCP browser exploration
await page.getByRole('combobox', { name: 'Filter by status' }).selectOption('completed');
const taskItems = page.getByRole('listitem');
const count = await taskItems.count();
for (let i = 0; i < count; i++) {
await expect(taskItems.nth(i).getByTestId('status-badge')).toHaveText('Completed');
}
});
// ⚠️ Needs human review — AI generated this but assertion is weak
// TODO: Strengthen the assertion to check specific error text
test('task creation form shows validation when title is empty', async ({ page }) => {
const taskPage = new TaskPage(page);
await taskPage.goto();
await taskPage.newTaskButton.click();
// AI correctly identified save button but missed the specific error selector
await taskPage.saveTaskButton.click();
// ⚠️ AI used a generic assertion here — human should make this more specific
await expect(page.getByRole('alert')).toBeVisible();
// TODO: Replace with: await expect(page.getByText('Title is required')).toBeVisible();
});
// ❌ AI got this wrong — generated a selector that doesn't exist
// Left here deliberately to show where AI fails
// test('admin can bulk delete tasks', async ({ page }) => {
// AI generated: await page.getByTestId('bulk-select-all').click();
// Problem: this element doesn't exist in our app
// AI hallucinated a feature that isn't there
// This is why you always review AI-generated tests before running
// });
The commented-out test at the bottom is important. It shows the failure mode honestly — AI sometimes generates tests for features it thinks exist but don't. The lesson: AI generates test scaffolding. Humans validate it.
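One way to catch the hallucinated-selector case before running anything is to compare the test ids a generated spec references against ids known to exist. The sketch below does that on plain strings. `referencedTestIds` and `unknownTestIds` are hypothetical helpers, and the list of known ids is assumed to come from a source you trust, such as a crawl of `[data-testid]` attributes. It only sees string literals passed directly to getByTestId, so ids built dynamically are missed.

```typescript
// Collect every string literal passed to getByTestId(...) in a spec's source text.
function referencedTestIds(specSource: string): string[] {
  const ids = new Set<string>();
  const pattern = /getByTestId\(\s*['"`]([^'"`]+)['"`]\s*\)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(specSource)) !== null) {
    ids.add(match[1]);
  }
  return Array.from(ids);
}

// Return the referenced ids that do not appear in the list of known ids.
function unknownTestIds(specSource: string, knownIds: string[]): string[] {
  return referencedTestIds(specSource).filter(id => knownIds.indexOf(id) === -1);
}

const generated = `
const taskTitle = await firstTask.getByTestId('task-title').textContent();
await page.getByTestId('bulk-select-all').click();
`;

console.log(unknownTestIds(generated, ['task-title', 'status-badge']));
// [ 'bulk-select-all' ]
```

A non-empty result does not prove the test is wrong, since the id may live on a page that was never crawled, but it tells the reviewer exactly where to look first.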
Here's the honest breakdown of where AI helps and where it doesn't:
✅ AI is excellent for:
├── Generating test skeletons for new features fast
├── Exploring an unfamiliar codebase via MCP
├── Suggesting test scenarios you hadn't thought of
├── Writing boilerplate (beforeEach, fixtures, imports)
├── Generating test data variations
└── Coverage gap analysis (TestPlanner.ts)
⚠️ AI needs human oversight for:
├── Verifying generated selectors actually exist in the DOM
├── Checking assertions are strong enough (not just toBeVisible)
├── Ensuring tests are truly independent (AI loves shared state)
├── Business logic assertions (AI doesn't know your rules)
└── Any security or permissions testing
❌ AI should not be trusted for:
├── Testing non-deterministic behaviour without human review
├── Deciding what constitutes a "pass" for complex flows
├── Replacing the understanding of what actually matters to test
└── Automatically committing healed selectors without review
The engineers who get the most value from AI tools are the ones who know these boundaries. They use AI to go faster on the tasks it does well. They apply their own judgment on the tasks it doesn't. 🎯
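The last boundary in that list, never committing healed selectors without review, can be encoded as a simple gate. The sketch below sorts heal suggestions by confidence so that only the strongest ones are proposed as patches and the rest are merely reported. `triageHeals`, `HealSuggestion`, and the default threshold are assumptions for illustration; nothing in it writes files or commits.

```typescript
type Confidence = 'high' | 'medium' | 'low';

interface HealSuggestion {
  originalSelector: string;
  healedSelector: string;
  confidence: Confidence;
}

const CONFIDENCE_RANK: Record<Confidence, number> = { low: 0, medium: 1, high: 2 };

// Split suggestions into those worth applying on a review branch and those
// that should only appear in the report. A human still reviews both groups.
function triageHeals(
  suggestions: HealSuggestion[],
  minimum: Confidence = 'high'
): { toPatch: HealSuggestion[]; reportOnly: HealSuggestion[] } {
  const toPatch: HealSuggestion[] = [];
  const reportOnly: HealSuggestion[] = [];
  for (const suggestion of suggestions) {
    const meetsBar = CONFIDENCE_RANK[suggestion.confidence] >= CONFIDENCE_RANK[minimum];
    (meetsBar ? toPatch : reportOnly).push(suggestion);
  }
  return { toPatch, reportOnly };
}

const triaged = triageHeals([
  { originalSelector: "getByTestId('save-btn')", healedSelector: "page.getByTestId('save-button')", confidence: 'high' },
  { originalSelector: "getByText('Save')", healedSelector: "page.getByText('Save')", confidence: 'low' },
]);

console.log(triaged.toPatch.length); // 1
console.log(triaged.reportOnly.length); // 1
```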
package.json Scripts
Add these to your package.json to make the AI tools easy to run:
{
"scripts": {
"test": "playwright test",
"test:headed": "playwright test --headed",
"test:debug": "playwright test --debug",
"test:ci": "playwright test --project=admin --project=user --project=multi-context --project=api",
"test:visual": "playwright test --project=visual --project=visual-responsive",
"test:api": "playwright test --project=api",
"test:update-snapshots": "playwright test --project=visual --update-snapshots",
"report": "playwright show-report",
"record-har": "ts-node scripts/record-har.ts",
"heal": "ts-node scripts/heal-selectors.ts",
"plan": "ts-node scripts/generate-test-plan.ts",
"notify": "ts-node scripts/notify-slack.ts"
}
}
Every file across all 8 parts — fully built, fully documented:
playwright-playbook/
├── .github/
│ └── workflows/
│ ├── playwright.yml ✅ Part 7
│ └── playwright-visual.yml ✅ Part 7
├── tests/
│ ├── auth/
│ │ └── login.spec.ts ✅ Part 1
│ ├── tasks/
│ │ └── task-management.spec.ts ✅ Part 1
│ ├── network/ ✅ Part 2
│ │ ├── api-mocking.spec.ts
│ │ ├── error-simulation.spec.ts
│ │ └── network-assertions.spec.ts
│ ├── multi-user/ ✅ Part 3
│ │ ├── role-permissions.spec.ts
│ │ └── realtime-collaboration.spec.ts
│ ├── multi-tab/ ✅ Part 3
│ │ └── multi-tab-flows.spec.ts
│ ├── api/ ✅ Part 4
│ │ ├── tasks-api.spec.ts
│ │ ├── auth-api.spec.ts
│ │ ├── graphql-api.spec.ts
│ │ └── api-ui-chain.spec.ts
│ ├── visual/ ✅ Part 5
│ │ ├── dashboard-visual.spec.ts
│ │ ├── task-visual.spec.ts
│ │ └── responsive-visual.spec.ts
│ ├── debug/ ✅ Part 6
│ │ └── trace-examples.spec.ts
│ └── ai/ ✅ Part 8
│ └── ai-generated.spec.ts
├── pages/
│ ├── LoginPage.ts ✅ Part 1
│ ├── TaskPage.ts ✅ Part 1
│ └── DashboardPage.ts ✅ Part 3
├── api/
│ ├── TaskApiClient.ts ✅ Part 4
│ └── AuthApiClient.ts ✅ Part 4
├── ai/ ✅ Part 8
│ ├── TestPlanner.ts
│ └── TestHealer.ts
├── fixtures/
│ ├── auth.fixture.ts ✅ Part 1
│ ├── tasks.json ✅ Part 2
│ ├── empty-tasks.json ✅ Part 2
│ ├── tasks-har.har ✅ Part 2
│ ├── multi-user.fixture.ts ✅ Part 3
│ └── api.fixture.ts ✅ Part 4
├── scripts/
│ ├── record-har.ts ✅ Part 2
│ ├── notify-slack.ts ✅ Part 7
│ ├── generate-test-plan.ts ✅ Part 8
│ └── heal-selectors.ts ✅ Part 8
├── utils/
│ ├── schema-validator.ts ✅ Part 4
│ ├── visual-helpers.ts ✅ Part 5
│ └── debug-helpers.ts ✅ Part 6
├── docker/ ✅ Part 7
│ ├── Dockerfile
│ └── docker-compose.yml
├── snapshots/ ✅ Part 5
├── .vscode/ ✅ Part 6
│ ├── extensions.json
│ └── launch.json
├── .auth/ ← git-ignored
│ ├── admin.json
│ └── user.json
├── .mcp.json ✅ Part 8
├── global-setup.ts ✅ Part 1
├── playwright.config.ts ✅ Parts 1–7 (final)
├── .gitignore ✅ Part 7
├── .env ← git-ignored
├── .env.example ← committed
├── TEST_PLAN.md ← generated by script
└── package.json
Part 1 — Stop Writing Tests Like a Beginner ✅ Done
Part 2 — Network Interception: The Complete Guide ✅ Done
Part 3 — Multi-User, Multi-Tab & Context Testing ✅ Done
Part 4 — API Testing (The Underrated Superpower) ✅ Done
Part 5 — Visual Regression Testing ✅ Done
Part 6 — Debugging Like a Pro: Trace Viewer & Inspector ✅ Done
Part 7 — The CI/CD Setup Nobody Shows You ✅ Done
Part 8 — Playwright Meets AI: Agents, MCP & Self-Healing ✅ Done
Eight parts. One complete framework.
Look at what you now have:
The foundation (Parts 1–2)
Semantic locators. storageState. Page Object Model. Full network interception layer. HAR recording. API mocking.
The depth (Parts 3–4)
Multi-user simultaneous testing. Real-time collaboration validation. Raw API testing with typed clients. Schema validation. GraphQL testing. API-UI chaining.
The quality layer (Parts 5–6)
Visual regression across full pages, components, and four viewports. Trace Viewer integration. Debug helpers that catch console errors and audit network calls automatically. VS Code launch configs.
The pipeline (Part 7)
Sharded GitHub Actions workflow — 4x faster. Cross-browser matrix. Docker for consistent VRT. Slack notifications. Published HTML reports as artifacts.
The AI layer (Part 8)
Playwright MCP — giving AI assistants a real browser. Test coverage analysis. Self-healing selector engine. AI-generated test scaffolding with honest evaluation of where it works and where it doesn't.
This isn't a toy. This is a production-grade Playwright framework — the kind teams spend months building organically.
You built it in 8 parts. 🏆
This series covered the framework. But a framework is only as good as the team using it.
If this series helped you — share it with your team. Point them to Part 1. Let them follow along from the beginning.
And if you want to go deeper on any specific part — drop a comment. The next series is already forming based on what you ask.
Thank you for following The Playwright Playbook from start to finish.
This is the series I wish had existed when I started testing AI systems. I hope it gave you what you needed to build something real.
Drop a comment below 👇
It's been a great ride. Let's keep building. 🙌
Faizal Shaikh | Senior Automation Engineer | Playwright & AI Testing