Synthetic Monitoring with Playwright: A Practical Implementation Guide

Playwright is well suited to synthetic monitoring of web applications. It is fast, it drives Chromium, Firefox, and WebKit, it handles modern JavaScript applications well, and its API is approachable enough that a first useful monitor can take about 20 minutes to write.

This guide covers how to structure Playwright scripts specifically for synthetic monitoring — which has different requirements than tests you run locally during development.

What Makes a Good Synthetic Monitor

Before writing code, understand the constraints. A synthetic monitor runs repeatedly, unattended, in an automated environment. That means:

  • It must be deterministic. Flaky tests are annoying in development; in monitoring they cause alert fatigue and eventually get ignored. If your monitor fails 10% of the time for no reason, you'll stop trusting it.
  • It must be fast. Monitors run frequently. A monitor that takes 45 seconds is expensive and may time out. Aim for under 30 seconds for most flows.
  • It must have meaningful failure messages. When it fails at 3am, the alert message needs to tell you what broke, not just "test failed."
  • It must use stable selectors. Data attributes over CSS classes, role-based selectors over layout-dependent ones.
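
To make the point about meaningful failure messages concrete, here is a minimal sketch of a wrapper that names the step that broke. The helper name step and the message format are assumptions for illustration, not part of Playwright.

```javascript
// Hypothetical helper: runs one monitor step and, on failure, rethrows
// an error that names the step and how long it ran before failing.
async function step(name, fn) {
  const started = Date.now();
  try {
    return await fn();
  } catch (error) {
    const elapsed = Date.now() - started;
    // `cause` keeps the original error (and its stack) for debugging.
    throw new Error(
      `Step "${name}" failed after ${elapsed}ms: ${error.message}`,
      { cause: error }
    );
  }
}

// Usage inside a monitor, where `page` is a Playwright Page:
//   await step('open login page', () => page.goto(url, { timeout: 15000 }));
//   await step('submit credentials', () => page.click('[data-testid="login-button"]'));
```

With this in place, an alert would read something like: Step "submit credentials" failed after 10012ms, followed by the underlying Playwright message.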

Setting Up Your First Monitor

Install Playwright if you haven't:

npm install playwright
# or with bun:
bun add playwright

Here's a minimal synthetic monitor for a login flow:

const { chromium } = require('playwright');

async function monitorLogin() {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  
  try {
    // Navigate to login page
    await page.goto('https://app.example.com/login', {
      waitUntil: 'networkidle',
      timeout: 15000
    });
    
    // Verify we're on the right page
    const title = await page.title();
    if (!title.includes('Log In')) {
      throw new Error(`Unexpected page title: "${title}"`);
    }
    
    // Fill and submit the login form
    await page.fill('[data-testid="email"]', process.env.MONITOR_EMAIL);
    await page.fill('[data-testid="password"]', process.env.MONITOR_PASSWORD);
    await page.click('[data-testid="login-button"]');
    
    // Wait for successful redirect
    await page.waitForURL('**/dashboard', { timeout: 10000 });
    
    // Verify dashboard loaded
    await page.waitForSelector('[data-testid="dashboard-content"]', {
      timeout: 5000
    });
    
    console.log('Login monitor: PASS');
  } catch (error) {
    console.error(`Login monitor: FAIL — ${error.message}`);
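    // Set the exit code rather than calling process.exit(), which would
    // terminate the process before the finally block closes the browser.
    process.exitCode = 1;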
  } finally {
    await browser.close();
  }
}

monitorLogin();

A few things to notice:

  • Credentials come from environment variables, not hardcoded in the script
  • Navigation and wait steps set explicit timeouts instead of relying on global defaults (the fill and click calls still use the default action timeout)
  • Error messages include context ("Unexpected page title: X") not just "assertion failed"
  • The script exits with code 1 on failure — this is how your monitoring system knows to trigger an alert
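
Because the credentials come from the environment, a missing variable is a likely misconfiguration, and it may otherwise surface later as a less obvious error from the fill step. The sketch below checks up front; requireEnv is a hypothetical helper name, not a Node.js or Playwright API.

```javascript
// Hypothetical helper: return a required environment variable,
// or throw an error that names the missing variable.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (value === undefined || value === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage at the top of a monitor script:
//   const email = requireEnv('MONITOR_EMAIL');
//   const password = requireEnv('MONITOR_PASSWORD');
```

Failing before the browser launches keeps a configuration mistake from looking like an outage of the login flow.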

Using HelpMeTest for Playwright Monitoring

If you're running monitors through HelpMeTest, you write Robot Framework tests that use the Playwright integration. The structure is similar, but the scheduling, alerting, and reporting are handled for you.

*** Settings ***
Library    Browser    # Playwright-based Robot Framework library

*** Test Cases ***
Monitor Login Flow
    New Browser    chromium    headless=True
    New Page
    Go To    https://app.example.com/login
    Get Title    ==    Log In — Example App
    Fill Text    [data-testid="email"]    ${MONITOR_EMAIL}
    Fill Text    [data-testid="password"]    ${MONITOR_PASSWORD}
    Click    [data-testid="login-button"]
    Wait For Navigation    url=**/dashboard
    Wait For Elements State    [data-testid="dashboard-content"]    visible
    [Teardown]    Close Browser

HelpMeTest runs this on a schedule, stores results, and alerts when it fails. You write the test once; the infrastructure handles the rest.

Handling Authentication State

Re-authenticating on every monitor run is slow and hammers your auth service. Playwright supports saving browser state — cookies, local storage — and reusing it across runs.

const { chromium } = require('playwright');
const path = require('path');

const STATE_FILE = path.join('/tmp', 'monitor-auth-state.json');

async function ensureAuthenticated() {
  const browser = await chromium.launch();
  try {
    const context = await browser.newContext();
    const page = await context.newPage();

    await page.goto('https://app.example.com/login');
    await page.fill('[data-testid="email"]', process.env.MONITOR_EMAIL);
    await page.fill('[data-testid="password"]', process.env.MONITOR_PASSWORD);
    await page.click('[data-testid="login-button"]');
    await page.waitForURL('**/dashboard');

    // Save auth state (cookies and local storage) for later runs
    await context.storageState({ path: STATE_FILE });
  } finally {
    // Close the browser even if the login fails
    await browser.close();
  }
}

async function monitorWithSavedAuth() {
  const browser = await chromium.launch();
  try {
    const context = await browser.newContext({
      storageState: STATE_FILE
    });
    const page = await context.newPage();

    // Go directly to the authenticated page; no login needed
    await page.goto('https://app.example.com/dashboard');
    await page.waitForSelector('[data-testid="dashboard-content"]');

    // ... rest of your monitor
  } finally {
    // Close the browser even if a check fails
    await browser.close();
  }
}

Run ensureAuthenticated() once (or daily, depending on session length), then use monitorWithSavedAuth() for the frequent checks.
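
One way to decide when to refresh the saved state is to look at the age of the state file. This is a minimal sketch using Node's fs module; the helper name isAuthStateFresh and the 12-hour limit in the usage note are assumptions, and the right limit depends on your session length.

```javascript
const fs = require('fs');

// Returns true when the saved state file exists and was written
// less than maxAgeMs milliseconds ago.
function isAuthStateFresh(stateFile, maxAgeMs, now = Date.now()) {
  try {
    const { mtimeMs } = fs.statSync(stateFile);
    return now - mtimeMs < maxAgeMs;
  } catch (error) {
    if (error.code === 'ENOENT') return false; // no state saved yet
    throw error; // permission problems etc. should not be hidden
  }
}

// Usage before a monitor run (12 hours is an illustrative value):
//   if (!isAuthStateFresh(STATE_FILE, 12 * 60 * 60 * 1000)) {
//     await ensureAuthenticated();
//   }
```

File age is only a proxy. If the server invalidates the session early, the monitor still needs to treat a redirect to the login page as a cue to re-authenticate.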

Monitoring Multi-Step User Flows

The real value of Playwright for synthetic monitoring is testing complete user journeys, not just single pages. Here's a more complete example monitoring an e-commerce checkout flow:

async function monitorCheckout(page) {
  const startTime = Date.now();
  const timings = {};
  
  // Search for a product
  await page.goto('https://shop.example.com');
  await page.fill('[data-testid="search-input"]', 'test product');
  await page.press('[data-testid="search-input"]', 'Enter');
  await page.waitForSelector('[data-testid="search-results"]');
  timings.search = Date.now() - startTime;
  
  // Open product page
  await page.click('[data-testid="product-card"]:first-child');
  await page.waitForSelector('[data-testid="product-price"]');
  timings.productPage = Date.now() - startTime;
  
  // Add to cart
  await page.click('[data-testid="add-to-cart"]');
  await page.waitForSelector('[data-testid="cart-count"][data-count="1"]');
  timings.addToCart = Date.now() - startTime;
  
  // Proceed to checkout
  await page.click('[data-testid="checkout-button"]');
  await page.waitForURL('**/checkout');
  timings.checkout = Date.now() - startTime;
  
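  // Verify checkout form is present. Wait for it instead of checking once,
  // because isVisible() returns immediately and can race with rendering.
  try {
    await page.waitForSelector('[data-testid="address-form"]', { timeout: 5000 });
  } catch (error) {
    throw new Error('Checkout page missing address form');
  }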
  
  console.log('Checkout flow timings:', timings);
  
  // Timings are cumulative from startTime, so timings.checkout is the total.
  // Fail if the whole flow took more than 10 seconds
  if (timings.checkout > 10000) {
    throw new Error(`Checkout flow too slow: ${timings.checkout}ms`);
  }
}

Notice the timing instrumentation. Knowing a step failed is important; knowing that the checkout step started taking 3x longer than usual is equally important and often comes before a full failure.
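
The timings recorded above are cumulative from the start of the flow, so spotting one slow step means converting them to per-step durations first. The sketch below does that and compares against a baseline; the function names, the baseline values, and the 3x factor are assumptions for illustration.

```javascript
// Convert cumulative timings (ms since the flow started) into per-step
// durations. Relies on the keys being in the order the steps ran.
function stepDurations(timings) {
  const durations = {};
  let previous = 0;
  for (const [stepName, elapsed] of Object.entries(timings)) {
    durations[stepName] = elapsed - previous;
    previous = elapsed;
  }
  return durations;
}

// Return the steps whose duration exceeds `factor` times the baseline.
// Steps without a baseline entry are skipped.
function slowSteps(durations, baseline, factor = 3) {
  return Object.keys(durations).filter(
    (stepName) =>
      baseline[stepName] !== undefined &&
      durations[stepName] > baseline[stepName] * factor
  );
}

// Example with made-up numbers:
const durations = stepDurations({ search: 1200, productPage: 2000, addToCart: 2500, checkout: 6100 });
const baseline = { search: 1000, productPage: 900, addToCart: 600, checkout: 1000 };
console.log(durations);                      // { search: 1200, productPage: 800, addToCart: 500, checkout: 3600 }
console.log(slowSteps(durations, baseline)); // [ 'checkout' ]
```

In practice the baseline would come from recent passing runs rather than hardcoded numbers.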

Handling Flakiness

The most common cause of flaky synthetic monitors is timing issues — not waiting long enough for elements to appear or actions to complete. Playwright's default waiting is usually good, but synthetic monitors run against production systems under real load.

Use waitForSelector with explicit timeouts:

// Fragile — may fail under load
await page.click('[data-testid="submit"]');
const result = await page.$('[data-testid="success-message"]');

// Better — waits up to 8 seconds for the element to appear
await page.click('[data-testid="submit"]');
await page.waitForSelector('[data-testid="success-message"]', { timeout: 8000 });

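Wait for network idle when appropriate. Use it sparingly: a page that polls or streams may never become idle, and Playwright's documentation discourages relying on networkidle in favor of waiting for specific elements. Where it does fit: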

// For pages that do heavy data fetching on load
await page.goto('https://app.example.com/dashboard', {
  waitUntil: 'networkidle'
});

Retry the whole monitor on transient failure:

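// maxAttempts counts total attempts: the default of 2 means one retry.
async function runWithRetry(monitorFn, maxAttempts = 2) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await monitorFn();
      return;
    } catch (error) {
      // Out of attempts: surface the last error to the caller
      if (attempt === maxAttempts) throw error;
      console.log(`Attempt ${attempt} failed, retrying: ${error.message}`);
      // Short pause so a momentary spike has time to clear
      await new Promise(resolve => setTimeout(resolve, 2000));
    }
  }
}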

One retry with a short delay eliminates most transient failures from network hiccups or momentary server load spikes. More than two retries and you're masking real problems.
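
Retries can hide a monitor that is slowly getting flakier, so it may help to record how many attempts a pass needed. The following is a sketch of a variant of the retry wrapper that returns a small report instead of throwing; the name runAndReport and the shape of the report are assumptions.

```javascript
// Variant of a retry wrapper that reports how many attempts were needed.
// maxAttempts counts total attempts; delayMs is the pause between them.
async function runAndReport(monitorFn, { maxAttempts = 2, delayMs = 2000 } = {}) {
  const failures = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await monitorFn();
      return { passed: true, attempts: attempt, failures };
    } catch (error) {
      failures.push(error.message);
      if (attempt < maxAttempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  return { passed: false, attempts: maxAttempts, failures };
}

// Usage:
//   const report = await runAndReport(monitorLogin);
//   if (report.passed && report.attempts > 1) {
//     console.warn('Passed only after retry:', report.failures);
//   }
```

Logging passes that needed a retry gives you a flakiness trend to watch without paging anyone for it.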

Capturing Screenshots on Failure

A screenshot at the moment of failure is worth a thousand log lines when debugging a 3am alert:

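async function monitorWithScreenshot(page, monitorFn) {
  try {
    await monitorFn(page);
  } catch (error) {
    const screenshotPath = `/tmp/monitor-failure-${Date.now()}.png`;
    try {
      await page.screenshot({ path: screenshotPath, fullPage: true });
      console.error(`Monitor failed. Screenshot: ${screenshotPath}`);
    } catch (screenshotError) {
      // Don't let a failed screenshot hide the original error
      console.error(`Monitor failed. Screenshot also failed: ${screenshotError.message}`);
    }
    throw error;
  }
}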

HelpMeTest automatically captures screenshots on test failure — you don't need to add this manually if you're using it as your monitoring platform.
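
If you run the scripts yourself and a single screenshot is not enough, Playwright's tracing API on the browser context can record the whole run. The sketch below keeps the trace only when the monitor fails. The wrapper name withTraceOnFailure and its argument order are assumptions; context.tracing.start and context.tracing.stop are the Playwright calls it relies on. Tracing adds overhead to each run, so treat this as an option rather than a default.

```javascript
// Records a Playwright trace for the run and saves it only on failure.
// `context` is a Playwright BrowserContext; tracePath is where to write the zip.
async function withTraceOnFailure(context, tracePath, monitorFn) {
  await context.tracing.start({ screenshots: true, snapshots: true });
  try {
    await monitorFn();
  } catch (error) {
    // Failure: write the trace to disk, then rethrow the original error
    await context.tracing.stop({ path: tracePath });
    throw error;
  }
  // Success: stop without a path, so no trace file is written
  await context.tracing.stop();
}

// Usage:
//   await withTraceOnFailure(context, '/tmp/monitor-trace.zip', () => monitorCheckout(page));
```

A saved trace can be opened afterwards with npx playwright show-trace followed by the file path.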

Structuring Your Monitor Suite

Once you have more than a few monitors, organization matters. A practical structure:

monitors/
  critical/
    login.js          # Runs every 1 minute
    checkout.js       # Runs every 1 minute
    api-health.js     # Runs every 30 seconds
  standard/
    search.js         # Runs every 5 minutes
    user-profile.js   # Runs every 5 minutes
    notifications.js  # Runs every 5 minutes
  daily/
    email-delivery.js # Runs once a day
    data-export.js    # Runs once a day

Critical monitors run more frequently and page on-call immediately on failure. Standard monitors alert the team channel. Daily monitors create tickets. The frequency and severity of alerting should match the business impact of the monitored flow.
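
The mapping from directory to alert severity can live in code so that a new monitor picks up its routing from where it is filed. This is a minimal sketch; the TIERS table, the alert target names, and the alertFor helper are assumptions for illustration.

```javascript
const path = require('path');

// Illustrative tier table. The alert targets are placeholder names.
const TIERS = {
  critical: { alert: 'page-on-call' },
  standard: { alert: 'team-channel' },
  daily: { alert: 'ticket' },
};

// Derive the tier from the monitor's parent directory name.
function alertFor(monitorPath) {
  const tier = path.basename(path.dirname(monitorPath));
  const config = TIERS[tier];
  if (!config) {
    throw new Error(`No tier configured for ${monitorPath}`);
  }
  return { tier, ...config };
}

console.log(alertFor('monitors/critical/login.js')); // { tier: 'critical', alert: 'page-on-call' }
```

Throwing on an unknown directory means a monitor filed in the wrong place fails loudly instead of running with no alerting at all.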

Running Playwright Monitors in CI

Add your monitors to your deployment pipeline to catch regressions before production:

# .github/workflows/synthetic-check.yml
name: Synthetic Monitor Check

on:
  push:
    branches: [main]
  schedule:
    - cron: '*/5 * * * *'  # Every 5 minutes

jobs:
  monitor:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm install
      - run: npx playwright install chromium
      - run: node monitors/critical/login.js
        env:
          MONITOR_EMAIL: ${{ secrets.MONITOR_EMAIL }}
          MONITOR_PASSWORD: ${{ secrets.MONITOR_PASSWORD }}

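This runs the login monitor on every push to main and on a schedule, and a failed run fails the workflow. To stop a breakage before it reaches users, point the monitor at a staging or preview URL and make the deploy job depend on it; a monitor that targets production can only report the problem after the fact. Note that GitHub Actions schedules run at most every 5 minutes and can be delayed under load, so the one-minute critical monitors need a dedicated scheduler.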
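
To run several monitors from one CI step and still fail the job if any of them breaks, a small runner can collect the results first. This is a sketch that assumes each monitor is an async function that throws on failure instead of exiting the process itself; runMonitors is a hypothetical name.

```javascript
// Run named monitor functions one after another and collect the results.
// A failure in one monitor does not stop the others from running.
async function runMonitors(monitors) {
  const results = [];
  for (const [name, monitorFn] of Object.entries(monitors)) {
    try {
      await monitorFn();
      results.push({ name, passed: true });
    } catch (error) {
      results.push({ name, passed: false, message: error.message });
    }
  }
  return results;
}

// Usage in a CI entry script:
//   const results = await runMonitors({ login: monitorLogin, checkout: runCheckout });
//   for (const r of results) console.log(r.passed ? 'PASS' : 'FAIL', r.name, r.message || '');
//   process.exitCode = results.every((r) => r.passed) ? 0 : 1;
```

Running the monitors sequentially keeps the output readable and avoids several browsers competing for the same small CI runner.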

Playwright is a solid foundation for synthetic monitoring. Write clean scripts, instrument timings, handle flakiness with explicit waits and retries, and organize your suite by criticality. Start with your three most business-critical flows and expand from there.
