PRD to Test Harness: A Practical Guide

A PRD describes what your product should do. A test harness verifies that it does. Right now, most teams have both — and they're completely disconnected. This guide walks through exactly how to convert a requirements document into a HelpMeTest verification harness that runs continuously against your production environment.

Key Takeaways

The PRD-to-harness gap is the source of most silent failures. Requirements live in Notion or Confluence. Tests live in a code repo. Nothing connects them. When behavior breaks, neither system notices.

You don't need to translate every line. Good requirements coverage means every user-facing behavior has a check, not that every line of code has a unit test.

Generate before you implement. The most valuable time to create the harness is before the feature is built. Failing tests become the specification for your AI coding agent.

Operational requirements need health checks, not browser tests. "Service must be available" and "job runs every 5 minutes" are requirements too. HelpMeTest handles these with heartbeat monitoring.

The harness runs forever. Once created, it verifies your requirements every 5 minutes — in production. Not just at CI time. Not just when someone runs it manually.

The Gap That Costs You

You have a PRD. It describes the feature. Your team built the feature. Now the PRD sits in a document, the code sits in a repo, and nothing connects them.

When a requirement breaks — an edge case, an interaction, a constraint — neither the PRD nor the tests catch it. The PRD doesn't run. The tests only catch what someone happened to test. The gap between what was specified and what gets verified is invisible until a user finds it.

This guide closes that gap. Step by step.

The approach comes from Requirements as Tests: instead of writing tests from code, derive checks from requirements. Every requirement gets a check. Every check runs continuously.

Step 1: Audit Your Requirements

Before translating anything, you need to know what you're working with.

Open your requirements document — PRD, user stories, Jira tickets, whatever you use. Go through it with one question: does this describe observable behavior?

Observable behavior means: a user does something, something happens that can be verified. "Users can add items to cart" is observable. "The checkout flow should feel smooth" is not.

Sort your requirements into three buckets:

Behavioral requirements — user-facing actions with verifiable outcomes:

  • "Users can add items to cart"
  • "Cart persists across page refreshes"
  • "Email is sent when order is placed"
  • "Search results update without page reload"

Operational requirements — system behaviors with timing/availability constraints:

  • "API responds within 2 seconds"
  • "Order processing job runs every 5 minutes"
  • "Service is available 99.9% of the time"

Non-testable requirements — design, tone, vague quality attributes:

  • "The interface should be intuitive"
  • "Performance should be good"

You'll create harness tests for the first bucket, health checks for the second, and skip the third. Don't try to test everything — test the things that can break without anyone noticing.

Step 2: Structure Requirements as Acceptance Criteria

Before generating any tests, sharpen your requirements into concrete acceptance criteria. This is the step most teams skip, and it's why tests end up vague.

For each behavioral requirement, write:

  • Given: the starting state
  • When: the user action
  • Then: the expected result (specific and verifiable)

Examples:

Requirement: Cart persists across page refreshes

Given: A user has added 1 item to their cart
When: They reload the page
Then: The cart still shows 1 item

Requirement: Out-of-stock items cannot be added

Given: A user is on a product page for an out-of-stock item
When: They attempt to click "Add to Cart"
Then: The button is disabled or absent

The more specific the "Then," the better the generated test. "The cart still shows 1 item" is testable. "The cart works correctly" is not.

Do this for every behavioral requirement before moving to Step 3.

Step 3: Store Requirements in HelpMeTest Artifacts

HelpMeTest's Artifacts system stores your requirements as structured context that the AI uses when generating tests.

Install the CLI and connect your AI tool:

curl -fsSL https://helpmetest.com/install | bash
helpmetest login
helpmetest install mcp --claude HELP-your-token

Then, in HelpMeTest, create an Artifact for the feature. Paste in:

  • The feature name and description
  • The acceptance criteria you wrote in Step 2
  • Any relevant constraints (user roles, data states, environment specifics)

This becomes the source of truth for test generation. When the AI generates tests, it reads from this Artifact — not from the codebase, not from your memory.
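
To make this concrete, here's a sketch of what the cart feature's Artifact might contain (the constraints shown are hypothetical examples; structure it however reads clearly):

Feature: Shopping cart
Description: Users can add products to a cart that persists across page refreshes.

Acceptance criteria:
  • Given a user has added 1 item to their cart, when they reload the page, then the cart still shows 1 item
  • Given a user is on a product page for an out-of-stock item, when they attempt to click "Add to Cart", then the button is disabled or absent

Constraints:
  • Applies to both logged-in and guest users
  • No account or payment method is required to add items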

Step 4: Generate the Behavioral Test Harness

With requirements stored, ask HelpMeTest's AI to generate the verification harness. The prompt:

"Generate HelpMeTest tests for [feature name] based on the requirements in the [feature] Artifact. Cover all acceptance criteria. Use Robot Framework + Playwright syntax."

The AI works from requirements, not code. If the requirement is "cart persists across refresh," it generates a test for that — regardless of whether the implementation exists yet.

A typical harness for the cart feature looks like:

*** Test Cases ***

Add Item To Cart
    Go To  https://shop.example.com/product/widget
    Wait For Elements State  button[data-testid="add-to-cart"]  visible  timeout=10s
    Click  button[data-testid="add-to-cart"]
    Wait For Elements State  .cart-notification  visible  timeout=5s
    Get Text  .cart-count  ==  1

Cart Persists Across Page Refresh
    Go To  https://shop.example.com/product/widget
    Click  button[data-testid="add-to-cart"]
    Wait For Elements State  .cart-count  visible  timeout=5s
    Reload
    Get Text  .cart-count  ==  1

Out Of Stock Item Cannot Be Added
    Go To  https://shop.example.com/product/out-of-stock
    Wait For Elements State  button[data-testid="add-to-cart"]  visible  timeout=10s
    # Checking the element state is more reliable than reading the disabled
    # attribute, whose value can be an empty string on a disabled button
    Wait For Elements State  button[data-testid="add-to-cart"]  disabled  timeout=5s

Cart Count Shows In Header
    Go To  https://shop.example.com
    ${before}=  Get Text  .cart-header-count
    Go To  https://shop.example.com/product/widget
    Click  button[data-testid="add-to-cart"]
    # Wait for the add-to-cart confirmation before reading the count,
    # so we don't capture a stale value
    Wait For Elements State  .cart-notification  visible  timeout=5s
    ${after}=  Get Text  .cart-header-count
    Should Not Be Equal  ${before}  ${after}

Each test corresponds directly to a requirement. If you have 6 requirements, you have (at minimum) 6 tests. The mapping is intentional and explicit.
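
For the cart harness above, the mapping reads straight off the test names:

  • "Users can add items to cart" → Add Item To Cart
  • "Cart persists across page refreshes" → Cart Persists Across Page Refresh
  • "Out-of-stock items cannot be added" → Out Of Stock Item Cannot Be Added
  • "Cart count shows in header" → Cart Count Shows In Header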

Step 5: Add Operational Health Checks

Behavioral tests cover user-facing behavior. Operational requirements — availability, timing, processing guarantees — are different. They're not browser flows. They're system behaviors that need heartbeat monitoring.

For each operational requirement, set up a health check:

# "Order processing job runs every 5 minutes"
<span class="hljs-comment"># Add this to your job's completion handler:
helpmetest health <span class="hljs-string">"order-processor" <span class="hljs-string">"10m"

<span class="hljs-comment"># "API is available" — add to your API server startup:
helpmetest health <span class="hljs-string">"checkout-api" <span class="hljs-string">"1m"

<span class="hljs-comment"># "Email service is running":
helpmetest health <span class="hljs-string">"email-sender" <span class="hljs-string">"30m"

The grace period (the second argument) is how long HelpMeTest waits before alerting you that the heartbeat is overdue. Set it to roughly 2x the expected interval.

If the heartbeat stops arriving — because the job crashed, the service went down, or the deployment broke something — you get an alert within the grace period. The operational requirement "runs every 5 minutes" becomes a monitored guarantee.
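
To make "completion handler" concrete, here's a minimal sketch of a cron-driven wrapper, where process_orders is a stand-in for your actual job command:

#!/usr/bin/env bash
# Runs from cron every 5 minutes. The heartbeat fires only after the
# job succeeds, so a crash, hang, or non-zero exit stops the heartbeat
# and triggers an alert once the grace period expires.
set -euo pipefail

process_orders                              # stand-in for your real job
helpmetest health "order-processor" "10m"   # grace ~2x the 5-minute interval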

Step 6: Run the Harness Before Implementation

This step is optional but highly recommended, especially if you're using AI coding agents to build features.

Run the harness before telling the agent to implement the feature. Every test will fail — because nothing is built yet. That's correct and expected.

Now use the harness as the agent's specification:

"Implement the shopping cart feature. The tests in HelpMeTest tell you when you're done. Don't stop until all tests pass."

The agent has a concrete, executable definition of done. It can't declare the feature complete while requirements are failing. This is the TDD pattern applied to AI-assisted development: failing tests as specification, passing tests as proof of completion.

For existing features, skip this step and run the harness against what's already built. Failing tests tell you what's broken. Passing tests tell you what's working.

Step 7: Deploy and Let It Run

Once you're satisfied with the harness, it runs on HelpMeTest's schedule — every 5 minutes, against your production environment.

This is the part that matters most. Requirements don't just break during deploys. They break because of:

  • Third-party API changes
  • Database migrations
  • Configuration drift
  • Background jobs timing out
  • Dependencies being updated

A CI-only test suite catches deploy-time breakage. A continuously running requirement harness catches everything else.

When a requirement fails, you get an alert via email or Slack: which requirement broke, when it started failing, and a session replay showing exactly what happened. You fix it before users notice.

What "Done" Means Now

With a PRD-to-harness workflow, "done" has a new definition:

Before: The code is written and the developer's tests pass.

After: Every requirement in the spec has a passing check. The checks run continuously in production.

This is a higher bar. It's also a more honest bar. A feature isn't done because code exists that looks correct. It's done because behavior has been verified against every stated requirement.

The time investment to build the harness for a typical feature is 30–60 minutes. The return is continuous, automated verification that requirements are met — not just when you deploy, but every 5 minutes, forever.

Checklist: PRD to Harness in One Afternoon

  1. Audit requirements — sort into behavioral, operational, non-testable
  2. Write acceptance criteria — Given/When/Then for each behavioral requirement
  3. Store in HelpMeTest Artifacts — requirements as structured context
  4. Generate behavioral tests — one test per requirement
  5. Add health checks — for operational requirements
  6. Run before implementation (if building new) — failing tests as specification
  7. Deploy the harness — runs every 5 minutes against production

# Start here
curl -fsSL https://helpmetest.com/install | bash
helpmetest login
helpmetest install mcp --claude HELP-your-token

For more on why requirements-derived checks catch what traditional tests miss, see Requirements as Tests. For the failure patterns that make this workflow necessary, see What Happens When Your AI Coding Agent Skips a Requirement.
