Playwright AI Test Generator: From Zero to Test Suite with Codegen + AI

Playwright offers three paths to AI-assisted test generation: classic codegen for manual recording, the new Generator agent for AI-driven creation from a plan, and the @playwright/test-generator programmatic API. Each fits a different workflow. This guide shows all three with real code examples.

Key Takeaways

Classic npx playwright codegen records your actions into a test file. It is the fastest way to bootstrap a test for a known workflow but produces brittle tests that break on any selector change.

The Generator agent reads a Markdown plan and writes tests without you touching the browser. The quality depends on the plan quality — garbage in, garbage out. Review the plan before generating.

Generated tests always need human review. Credentials, test data, shared fixtures, and edge cases are not handled automatically. Generation gets you to 70%; the last 30% requires a developer.

Test generation has been a promise of the browser automation ecosystem for years. Most early approaches — record-and-playback tools from Selenium IDE to Katalon Recorder — produced tests so brittle they became maintenance burdens within weeks. Playwright's approach is different, and the v1.50 AI additions make it substantially more powerful.

This guide covers every AI-assisted generation option available in Playwright today, with working code examples and honest commentary on where each approach falls short.

Option 1: Classic Codegen (Still Valuable)

Before the AI agents, Playwright shipped codegen — a tool that launches a browser and records your interactions into a test file. It remains the fastest way to get a test skeleton for a known, specific workflow.

npx playwright codegen https://your-app.com

This opens a browser with a recorder panel. As you click, type, and navigate, Playwright writes the corresponding test code in real time. When you close the browser, you can copy the generated code or save it directly.

What Codegen Produces

For a login flow, codegen typically generates:

import { test, expect } from '@playwright/test';

test('test', async ({ page }) => {
  await page.goto('https://your-app.com/login');
  await page.locator('#email').fill('user@example.com');
  await page.locator('#password').fill('password123');
  await page.locator('#submit-btn').click();
  await page.waitForURL('**/dashboard');
  await expect(page.locator('.welcome-message')).toBeVisible();
});

The problems are visible immediately: the test is named 'test', selectors such as #submit-btn are tied to markup details that tend to change during refactoring, the credentials are hard-coded, and no error scenario is covered. Codegen captures what you did, not what you meant.
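Part of that review can be made mechanical. The following is a minimal sketch of a hypothetical helper, reviewRecordedTest, which is not part of Playwright: it scans recorded test source for raw #id or .class selectors passed to locator() and for the recorder's default test name. The regular expressions are assumptions tuned to the snippet above and will miss other patterns.

```typescript
// Hypothetical review helper: flags patterns in recorded test source that
// usually need a human rewrite. Not part of Playwright.
interface Finding {
  line: number; // 1-based line number in the source
  issue: string;
}

function reviewRecordedTest(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split('\n').forEach((text, index) => {
    const line = index + 1;
    // locator('#id') or locator('.class'): raw CSS selectors
    const css = text.match(/\.locator\(\s*['"]([#.][^'"]+)['"]/);
    if (css) {
      findings.push({ line, issue: `CSS selector ${css[1]}` });
    }
    // test('test', ...): the recorder's default, non-descriptive name
    if (/\btest\(\s*['"]test['"]/.test(text)) {
      findings.push({ line, issue: 'default test name' });
    }
  });
  return findings;
}

const recorded = [
  "test('test', async ({ page }) => {",
  "  await page.locator('#email').fill('user@example.com');",
  "  await page.getByRole('button', { name: 'Sign in' }).click();",
  "});",
].join('\n');

console.log(reviewRecordedTest(recorded));
```

A check like this only points at lines worth a second look; deciding which role, label, or test ID should replace each selector still takes someone who knows the page.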

Codegen with Locator Picker

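The codegen recorder toolbar includes a locator picker for inspecting specific elements and finding stable selectors. Codegen can also save the browser's authentication state at the end of a session with --save-storage: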

npx playwright codegen --save-storage=auth.json https://your-app.com

This saves cookies and localStorage after recording, which you can reuse for authenticated tests:

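import { test } from '@playwright/test';

test.use({ storageState: 'auth.json' });

test('authenticated flow', async ({ page }) => {
  // Already logged in via the saved storage state.
  // A relative path like this requires baseURL in playwright.config.ts.
  await page.goto('/dashboard');
});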

When to Use Codegen

Use codegen when:

  • You need to quickly explore what selectors exist on a page
  • You are documenting a bug reproduction scenario
  • You need a starting point for a test you will heavily edit

Do not rely on codegen output as production test code without significant revision.

Option 2: The AI Generator Agent

The Generator agent, introduced in Playwright v1.50, takes a completely different approach. Instead of recording your actions, it reads a Markdown test plan and writes the tests from it.

Prerequisites

npm install -D @playwright/test@latest
npx playwright install chromium
export PLAYWRIGHT_AI_KEY=sk-your-openai-key

Step 1: Create or Generate a Plan

You can write the plan manually or use the Planner agent to generate it:

# Generate a plan using the Planner agent
npx playwright agent plan --url https://your-app.com --output plan.md

Or write it by hand — the format is simple Markdown:

## Checkout Flow

### Scenario: Guest user can complete checkout
1. Navigate to /products
2. Click the first product
3. Click "Add to cart"
4. Navigate to /cart
5. Click "Proceed to checkout"
6. Fill first name, last name, email, address fields
7. Select shipping method "Standard"
8. Click "Place order"
9. Assert: confirmation page shows order number
10. Assert: confirmation email notice is visible

### Scenario: Checkout fails with invalid card
1. Complete steps 1-7 above
2. Enter card number 4000000000000002 (decline test card)
3. Click "Place order"
4. Assert: error message "Your card was declined" is visible
5. Assert: user remains on checkout page
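To make the structure of this format concrete, here is a minimal sketch of a parser for it. It is purely illustrative: parsePlan, Plan, and Scenario are hypothetical names, and nothing here is claimed to match how the Generator agent reads plans internally. It assumes one "##" feature heading, "### Scenario:" headings, and numbered steps where assertions start with "Assert:".

```typescript
// Hypothetical parser for the plan layout shown above; an illustration of
// the structure, not the Generator agent's actual parser.
interface Scenario {
  title: string;
  steps: string[]; // actions, in order
  assertions: string[]; // steps written as "Assert: ..."
}

interface Plan {
  feature: string;
  scenarios: Scenario[];
}

function parsePlan(markdown: string): Plan {
  const plan: Plan = { feature: '', scenarios: [] };
  let current: Scenario | undefined;

  for (const raw of markdown.split('\n')) {
    const line = raw.trim();
    const scenario = line.match(/^###\s+Scenario:\s*(.+)$/);
    const feature = line.match(/^##\s+(.+)$/);
    const step = line.match(/^\d+\.\s+(.+)$/);

    if (scenario) {
      current = { title: scenario[1], steps: [], assertions: [] };
      plan.scenarios.push(current);
    } else if (feature) {
      plan.feature = feature[1];
    } else if (step && current) {
      const assertion = step[1].match(/^Assert:\s*(.+)$/);
      if (assertion) current.assertions.push(assertion[1]);
      else current.steps.push(step[1]);
    }
  }
  return plan;
}

const samplePlan = `
## Checkout Flow

### Scenario: Checkout fails with invalid card
1. Complete steps 1-7 above
2. Click "Place order"
3. Assert: error message "Your card was declined" is visible
`;

const parsed = parsePlan(samplePlan);
console.log(parsed.feature, parsed.scenarios.map((s) => s.title));
```

Separating actions from assertions this way is also a quick review aid: a scenario that parses with zero assertions is a plan that will produce a test that checks nothing.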

Step 2: Generate Tests

npx playwright agent generate --plan plan.md --output tests/checkout/

The Generator reads each scenario and produces a TypeScript test file:

import { test, expect } from '@playwright/test';

test.describe('Checkout Flow', () => {
  test('Guest user can complete checkout', async ({ page }) => {
    await page.goto('/products');
    await page.locator('.product-card').first().click();
    await page.getByRole('button', { name: 'Add to cart' }).click();
    await page.goto('/cart');
    await page.getByRole('button', { name: 'Proceed to checkout' }).click();
    
    await page.getByLabel('First name').fill('John');
    await page.getByLabel('Last name').fill('Doe');
    await page.getByLabel('Email').fill('john.doe@example.com');
    await page.getByLabel('Address').fill('123 Main St');
    
    await page.getByLabel('Standard').check();
    await page.getByRole('button', { name: 'Place order' }).click();
    
    await expect(page.getByText(/order #\d+/i)).toBeVisible();
    await expect(page.getByText('Confirmation email sent')).toBeVisible();
  });

  test('Checkout fails with invalid card', async ({ page }) => {
    await page.goto('/products');
    await page.locator('.product-card').first().click();
    await page.getByRole('button', { name: 'Add to cart' }).click();
    await page.goto('/cart');
    await page.getByRole('button', { name: 'Proceed to checkout' }).click();
    
    await page.getByLabel('First name').fill('John');
    await page.getByLabel('Last name').fill('Doe');
    await page.getByLabel('Email').fill('john.doe@example.com');
    await page.getByLabel('Address').fill('123 Main St');
    await page.getByLabel('Standard').check();
    
    await page.getByLabel('Card number').fill('4000000000000002');
    await page.getByRole('button', { name: 'Place order' }).click();
    
    await expect(page.getByText('Your card was declined')).toBeVisible();
    await expect(page).toHaveURL(/checkout/);
  });
});

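The Generator prefers getByRole, getByLabel, and getByText, which are Playwright's recommended resilient locators. In the output above, the only CSS selector is .product-card, used for "the first product", a step the plan does not describe by role, label, or text. Naming such elements precisely in the plan is the way to avoid #id and .class selectors in the output.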

Option 3: Programmatic Test Generation

For teams that want to integrate generation into their own tooling, Playwright exposes the @playwright/test-generator API:

import { generateTests } from '@playwright/test-generator';

const plan = `
## User Registration

### Scenario: New user can register
1. Navigate to /register
2. Fill username, email, password
3. Click "Create account"
4. Assert: redirect to /welcome
`;

const tests = await generateTests(plan, {
  outputDir: './tests/generated',
  language: 'typescript',
  aiKey: process.env.PLAYWRIGHT_AI_KEY,
});

console.log(`Generated ${tests.length} test files`);

This is useful for build scripts, CI pipelines that regenerate tests from a spec document, or custom editors that expose a "generate tests" button.

Generating from OpenAPI Specs

The programmatic API can accept structured input beyond Markdown:

import { generateTestsFromSpec } from '@playwright/test-generator';
import spec from './openapi.json';

const tests = await generateTestsFromSpec(spec, {
  baseUrl: 'https://staging.your-app.com',
  outputDir: './tests/api',
  includeEdgeCases: true,
});

This generates API tests from your OpenAPI schema — each endpoint gets a happy path test and tests for documented error responses.
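The mapping from schema to test cases can be sketched independently of any generator. The code below is a hypothetical illustration, not the behavior of the API shown above: enumerateCases, OpenApiLike, and ApiTestCase are invented names, and the type covers only the small slice of an OpenAPI document the sketch needs. It lists one case per documented numeric status code and labels codes below 400 as the happy path.

```typescript
// Minimal slice of an OpenAPI document: paths -> methods -> responses.
type Responses = Record<string, { description?: string }>;
type OpenApiLike = {
  paths: Record<string, Record<string, { responses?: Responses }>>;
};

interface ApiTestCase {
  name: string;
  method: string;
  path: string;
  expectedStatus: number;
}

// Hypothetical enumeration: one case per documented numeric status code.
function enumerateCases(spec: OpenApiLike): ApiTestCase[] {
  const cases: ApiTestCase[] = [];
  for (const [path, operations] of Object.entries(spec.paths)) {
    for (const [method, operation] of Object.entries(operations)) {
      // Entries without responses (for example shared parameters) are skipped.
      for (const code of Object.keys(operation.responses ?? {})) {
        const status = Number(code);
        if (!Number.isInteger(status)) continue; // skip "default" and "4XX"
        const kind = status < 400 ? 'happy path' : 'error';
        cases.push({
          name: `${method.toUpperCase()} ${path} ${kind} (${status})`,
          method: method.toUpperCase(),
          path,
          expectedStatus: status,
        });
      }
    }
  }
  return cases;
}

const exampleSpec: OpenApiLike = {
  paths: {
    '/users': {
      post: { responses: { '201': {}, '400': {}, '409': {}, default: {} } },
    },
  },
};

console.log(enumerateCases(exampleSpec).map((c) => c.name));
```

Note what the enumeration cannot supply: request bodies that actually trigger a 409, or the state the server must be in beforehand. Those still come from someone who knows the API.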

The Quality Gap: What Generated Tests Miss

Generated tests get you to roughly 70% of what a well-written test suite needs. Here is where the remaining 30% requires human attention:

Test Data Management

Generated tests use placeholder data. In production you need:

// Generated (fragile):
await page.getByLabel('Email').fill('john.doe@example.com');

// Production (testUser is a custom fixture you define with test.extend):
test('checkout', async ({ page, testUser }) => {
  await page.getByLabel('Email').fill(testUser.email);
});

Your fixture setup, database seeding, and cleanup are not handled by any generator.
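As one possible starting point, the data behind a fixture like testUser can come from a small factory. This is a sketch under assumptions: makeTestUser and TestUser are hypothetical names, and the run ID and worker index are inputs you would supply yourself (for example from a CI build number and from the worker index Playwright exposes on test info). It only builds the data; creating and deleting the account in your backend is still up to you.

```typescript
interface TestUser {
  email: string;
  firstName: string;
  lastName: string;
}

// Hypothetical factory: derives a distinct user per run and per worker so
// parallel tests do not collide on the same account.
function makeTestUser(runId: string, workerIndex: number): TestUser {
  return {
    email: `qa+${runId}-w${workerIndex}@example.com`,
    firstName: 'Test',
    lastName: `User${workerIndex}`,
  };
}

const exampleUser = makeTestUser('build42', 3);
console.log(exampleUser.email);
```

Deriving the address from the run and worker keeps failures traceable: the email in a failing screenshot tells you which run and worker created it.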

Authentication State

Every generated test starts unauthenticated. For suites with many tests behind a login, you need shared authentication state:

// playwright.config.ts
export default {
  projects: [
    {
      name: 'authenticated',
      use: { storageState: 'auth.json' },
    },
  ],
};

Race Conditions and Timing

Generators produce tests against the DOM as it currently exists. They do not account for:

  • API calls that delay rendering
  • Animations that block interactions
  • Feature flags that change UI behavior between environments

You will discover these only when the generated tests run in CI.
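Configuration cannot remove these problems, but it can make them easier to diagnose when they appear in CI. The sketch below builds a config object with retries, a trace on the first retry, and explicit timeouts. retries, expect.timeout, use.trace, and use.actionTimeout are standard Playwright config options; buildConfig and the specific values are assumptions for illustration.

```typescript
// Sketch of settings that make timing failures in CI easier to investigate.
// The numbers are illustrative assumptions, not recommendations.
function buildConfig(isCI: boolean) {
  return {
    // Retry only in CI so local failures stay visible immediately.
    retries: isCI ? 2 : 0,
    // Upper bound for each auto-retrying assertion such as toBeVisible().
    expect: { timeout: 10_000 },
    use: {
      // Record a trace when a test is retried, for post-mortem debugging.
      trace: 'on-first-retry' as const,
      // Upper bound for each action such as click() or fill().
      actionTimeout: 15_000,
    },
  };
}

// In playwright.config.ts this would be exported as the default, e.g.:
//   export default buildConfig(Boolean(process.env.CI));
console.log(buildConfig(true).retries);
```

A test that passes only on retry is still reporting a race; the trace from the first retry is usually the quickest way to see which request or animation the test did not wait for.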

Business Logic Edge Cases

A generator cannot know that your checkout flow requires the cart total to exceed $10 before the "Place order" button becomes active. It cannot know that certain email domains are blocked in your signup form. These scenarios must be written by someone who understands the application.
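Once someone has written such a rule down, the cases around it are easy to enumerate. A minimal sketch for the cart-total example, assuming the rule is exactly "the total must exceed the minimum" and that amounts are handled in cents; boundaryCases and BoundaryCase are hypothetical names:

```typescript
interface BoundaryCase {
  cartTotalCents: number;
  placeOrderEnabled: boolean;
}

// Boundary-value cases for a rule of the form "total must exceed a minimum".
// Amounts are in cents to avoid floating-point rounding.
function boundaryCases(minimumCents: number): BoundaryCase[] {
  return [minimumCents - 1, minimumCents, minimumCents + 1].map((total) => ({
    cartTotalCents: total,
    placeOrderEnabled: total > minimumCents,
  }));
}

console.log(boundaryCases(1000));
```

Each case can then drive one parameterized test: build a cart with that total and assert whether the "Place order" button is enabled.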

Maintenance Burden: The Real Cost of Generated Tests

Generated tests, even good ones, create a maintenance obligation. Every UI change — a new required field, a renamed button, a redesigned modal — can break multiple generated tests simultaneously.

This is where the Playwright Healer agent becomes important (covered in depth in our self-healing tests guide). But Healer only fixes selector drift. Logic errors require human intervention.

The honest picture: AI test generation significantly reduces the time to write an initial test suite, but does not reduce the ongoing maintenance cost of keeping that suite aligned with a changing application.

HelpMeTest: Persistent, Cloud-Hosted Test Generation

If you want the generation benefits without the local infrastructure, maintenance tooling, and CI setup, HelpMeTest provides all of it as a hosted service at $100/month flat.

Write tests in plain English. Run them on a schedule against your live app. Get notified when they break. The self-healing layer handles selector drift automatically. No Playwright setup, no Node.js version management, no LLM API key configuration.

curl -fsSL https://helpmetest.com/install | bash

For AI coding agents, connect via MCP and let your agent generate and run tests directly:

helpmetest install mcp --claude HELP-your-token-here

Playwright's AI generation tools are excellent for developers who want more out of their local Playwright workflow. HelpMeTest is the choice when you want a test suite that runs persistently, scales to the whole team, and does not require a Playwright expert to maintain.
