Writing Great Gherkin Scenarios: Best Practices and Common Mistakes

Gherkin is the language that makes Behavior-Driven Development work. When teams write it well, feature files become living documentation that business stakeholders can read and developers can execute. When teams write it poorly, Gherkin becomes a maintenance nightmare full of brittle, overlapping step definitions and scenarios that no one outside engineering can understand.

This guide covers the practical patterns and anti-patterns that separate great Gherkin from mediocre Gherkin — with real examples you can adapt for your own projects.

What Gherkin Actually Is

Gherkin is a domain-specific language for describing software behavior in plain natural language; its keywords default to English and are translated into many other spoken languages. It uses a small set of keywords (Feature, Rule, Background, Scenario, Scenario Outline, Examples, Given, When, Then, And, and But) to structure each test case as a concrete example of how a user and the system interact.

The three core keywords form the backbone of every scenario:

  • Given: the precondition or starting state
  • When: the action the user takes
  • Then: the expected observable outcome

Feature: User Authentication

  Scenario: Successful login with valid credentials
    Given a registered user with email "alice@example.com" and password "SecurePass123"
    When the user submits the login form
    Then the user is redirected to the dashboard
    And the welcome message displays "Welcome back, Alice"

This is clean Gherkin. Now let's dig into what makes it clean and how to keep it that way.

Declarative vs Imperative Scenarios

The single biggest quality distinction in Gherkin is the difference between declarative and imperative scenarios.

Imperative scenarios describe how the user interacts with the system — the clicks, the form fields, the exact UI mechanics:

# Imperative — avoid this style
Scenario: Login
  Given the user navigates to "https://app.example.com/login"
  When the user clicks on the "Email" field
  And the user types "alice@example.com"
  And the user clicks on the "Password" field
  And the user types "SecurePass123"
  And the user clicks the "Sign In" button
  Then the URL should be "https://app.example.com/dashboard"

Declarative scenarios describe what the user wants to accomplish — the intent, not the mechanics:

# Declarative — prefer this style
Scenario: Login with valid credentials
  Given Alice is a registered user
  When she logs in with her credentials
  Then she sees her personal dashboard

Declarative scenarios are shorter, more readable, and far more resilient to UI changes. When the login page gets redesigned, you change one step definition — not thirty scenarios. The "how" lives in the step implementation, not in the feature file.
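
To make "the how lives in the step implementation" concrete, here is a minimal sketch in plain JavaScript. The names FakeLoginPage and logInWithCredentials are hypothetical, and the page object only records actions instead of driving a real browser; in a real suite, the step definition behind "When she logs in with her credentials" would delegate to a helper like this through whatever automation library you use.

```javascript
// Hypothetical page object. In a real suite these methods would drive a browser;
// here they only record what would happen, so the sketch runs on its own.
class FakeLoginPage {
  constructor() {
    this.actions = [];
  }
  open() {
    this.actions.push('open /login');
  }
  fill(field, value) {
    this.actions.push(`fill ${field}=${value}`);
  }
  submit() {
    this.actions.push('click Sign In');
  }
}

// The single place that knows HOW a login happens.
// A redesigned login page means changing this function, not the feature files.
function logInWithCredentials(page, user) {
  page.open();
  page.fill('Email', user.email);
  page.fill('Password', user.password);
  page.submit();
}

const page = new FakeLoginPage();
logInWithCredentials(page, { email: 'alice@example.com', password: 'SecurePass123' });
console.log(page.actions.join('\n'));
```

The six imperative steps from the first scenario have not disappeared; they have moved behind one function that every login scenario can share.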

A good test: can a product manager or business analyst read your scenario and immediately understand what it's testing? If yes, you're writing declaratively. If they'd need to consult a UI designer to interpret it, you've gone too imperative.

Writing Effective Background Steps

The Background keyword lets you define steps that run before every scenario in a feature file. Use it for shared preconditions — but use it carefully.

Feature: Shopping Cart

  Background:
    Given the user is logged in as "customer"
    And the product catalog contains at least 10 items

  Scenario: Adding an item to an empty cart
    When the user adds "Wireless Headphones" to the cart
    Then the cart displays 1 item
    And the cart total shows "$89.99"

  Scenario: Adding duplicate items increases quantity
    Given the cart already contains 1 "Wireless Headphones"
    When the user adds "Wireless Headphones" to the cart again
    Then the cart displays 2 items
    And the cart total shows "$179.98"

Background best practices:

  1. Keep it short — two or three steps maximum
  2. Only include context that genuinely applies to all scenarios in the file
  3. Do not use Background to hide important scenario-specific context
  4. If you find yourself writing "Given the user is NOT logged in" as a one-off exception, that scenario probably belongs in a different feature file

The Background anti-pattern is stuffing so much into Background that readers must scroll up repeatedly to understand each scenario:

# Anti-pattern: Background that's longer than the scenarios
Background:
  Given the system is initialized
  And the database is seeded with test data
  And the email service is mocked
  And the payment gateway is in sandbox mode
  And three user accounts exist: "admin", "customer", "guest"
  And the feature flag "new-checkout" is enabled
  And the user is logged in as "customer"
  And the shopping cart is empty

This background creates hidden dependencies and makes scenarios hard to read in isolation. Move the specific setup into the scenarios that need it.
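
The "two or three steps maximum" guideline is easy to check mechanically. The following is a rough sketch, not a real Gherkin parser: countBackgroundSteps is a hypothetical helper that assumes one Background per file and one step per line, with no doc strings or data tables.

```javascript
// Lines that start a step, and lines that start a new section of the file.
const STEP = /^\s*(Given|When|Then|And|But)\s/;
const SECTION = /^\s*(Feature|Rule|Background|Scenario Outline|Scenario Template|Scenario|Examples|Example):/;

// Counts the steps that belong to the Background block of a feature file.
function countBackgroundSteps(featureText) {
  let inBackground = false;
  let count = 0;
  for (const line of featureText.split('\n')) {
    if (SECTION.test(line)) {
      // Any section keyword either opens or closes the Background block.
      inBackground = /^\s*Background:/.test(line);
    } else if (inBackground && STEP.test(line)) {
      count += 1;
    }
  }
  return count;
}

const feature = `
Feature: Shopping Cart

  Background:
    Given the user is logged in as "customer"
    And the product catalog contains at least 10 items

  Scenario: Adding an item to an empty cart
    When the user adds "Wireless Headphones" to the cart
    Then the cart displays 1 item
`;

const steps = countBackgroundSteps(feature);
console.log(steps > 3 ? `Background too long: ${steps} steps` : `Background OK: ${steps} steps`);
```

Run against the eight-step anti-pattern above, the same function would report a Background that is too long.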

Scenario Outlines: Testing Multiple Data Combinations

When you need to run the same behavior with different data, Scenario Outline (also called Scenario Template in some implementations) eliminates duplication:

Feature: Password Validation

  Scenario Outline: Reject invalid passwords during registration
    Given a user is completing registration
    When they enter the password "<password>"
    Then they see the error "<error_message>"

    Examples:
      | password     | error_message                            |
      | abc          | Password must be at least 8 characters   |
      | password     | Password must contain a number           |
      | 12345678     | Password must contain a letter           |
      | Password1    | Password must contain a special character|
      | P@ssword1    |                                          |

  Scenario Outline: Accept valid passwords
    Given a user is completing registration
    When they enter the password "<password>"
    Then registration proceeds without password errors

    Examples:
      | password      |
      | P@ssword1!    |
      | Secur3#Pass   |
      | MyStr0ng!pwd  |

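Notice the last row of the invalid passwords table: its error_message cell is empty, and that is a mistake. "P@ssword1" satisfies all four rules listed in the table, so the step would end up asserting an empty error. Remove the row, or move it to the valid passwords outline where it belongs. Examples tables should never contain ambiguous empty cells.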
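
To see why an empty cell is a problem, it helps to picture how an outline expands: each Examples row becomes one concrete scenario, with every placeholder replaced by the value from the matching column. The sketch below is a toy model of that expansion, not the real Gherkin parser; parseRow and expandOutline are hypothetical names, and the table format is assumed to be simple pipe-delimited rows.

```javascript
// "| abc | too short |" -> ["abc", "too short"]
function parseRow(line) {
  return line.trim().replace(/^\||\|$/g, '').split('|').map((cell) => cell.trim());
}

// Returns one array of concrete steps per Examples row.
// Throws if a row leaves any column empty, since the resulting step would be ambiguous.
function expandOutline(steps, tableLines) {
  const [header, ...rows] = tableLines.map(parseRow);
  return rows.map((row, index) => {
    const emptyColumns = header.filter((_, i) => !row[i]);
    if (emptyColumns.length > 0) {
      throw new Error(`Examples row ${index + 1} has empty cells: ${emptyColumns.join(', ')}`);
    }
    // Replace each <placeholder> with the value from the matching column.
    return steps.map((step) =>
      step.replace(/<([^>]+)>/g, (match, name) => {
        const column = header.indexOf(name);
        return column === -1 ? match : row[column];
      })
    );
  });
}

const outlineSteps = [
  'When they enter the password "<password>"',
  'Then they see the error "<error_message>"',
];
const table = [
  '| password | error_message                          |',
  '| abc      | Password must be at least 8 characters |',
  '| password | Password must contain a number         |',
];

for (const scenario of expandOutline(outlineSteps, table)) {
  console.log(scenario.join('\n'));
}
```

Feeding this function a row with a blank error_message raises an error instead of silently generating a step that expects an empty message.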

Tips for Examples tables:

  • Use meaningful column headers that read like plain English
  • Keep the number of columns small — if you need more than four columns, the outline is probably too complex
  • Split related rows into separate, named Examples blocks, and tag each block individually when your tool supports it
  • Test boundary values: minimum, maximum, just-below-minimum, just-above-maximum
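
The boundary-value tip can be sketched as a small generator for Examples rows. The 8-character minimum comes from the password table earlier in this section; the 64-character maximum is an assumed value for illustration only, and boundaryLengths and toExamplesTable are hypothetical helpers.

```javascript
// Returns the four boundary cases for a length rule with an inclusive min and max.
function boundaryLengths(min, max) {
  return [
    { length: min - 1, valid: false }, // just below minimum
    { length: min, valid: true },      // minimum
    { length: max, valid: true },      // maximum
    { length: max + 1, valid: false }, // just above maximum
  ];
}

// Formats the cases as pipe-delimited rows that could be pasted under Examples.
function toExamplesTable(cases) {
  const rows = cases.map(
    ({ length, valid }) => `| ${String(length).padEnd(6)} | ${valid ? 'accepted' : 'rejected'} |`
  );
  return ['| length | outcome  |', ...rows].join('\n');
}

// Assumed limits: minimum 8 (from the table above), maximum 64 (illustrative).
console.log(toExamplesTable(boundaryLengths(8, 64)));
```

Four rows per rule is usually enough; values from the middle of the range rarely reveal anything the boundaries do not.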

Tags: Organizing Your Test Suite

Tags categorize scenarios so you can run subsets of your suite selectively:

# Feature-level tags are inherited by every scenario in the file
@authentication
Feature: User Login

  @smoke @critical @happy-path
  Scenario: Successful login
    Given a registered user
    When they log in with valid credentials
    Then they access the dashboard

  @edge-case @security
  Scenario: Account lockout after failed attempts
    Given a registered user
    When they enter the wrong password 5 times consecutively
    Then the account is locked for 30 minutes
    And they receive a security alert email

  @regression @slow
  Scenario: Login across multiple browser sessions
    Given a user is logged in on Chrome
    When they also log in on Firefox
    Then both sessions are active simultaneously

Tag conventions that work in practice:

| Tag         | Purpose                                                 |
| @smoke      | Fast sanity check; runs in under 5 minutes              |
| @regression | Full regression suite                                   |
| @critical   | Business-critical paths; failures block release         |
| @wip        | Work in progress; excluded from CI                      |
| @slow       | Long-running tests; run nightly, not on every commit    |
| @security   | Security-specific scenarios                             |
| @flaky      | Known intermittent failures under investigation         |

Avoid over-tagging. If every scenario has eight tags, the tags lose their organizational value.
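
Selecting a subset by tag boils down to set membership, with one detail that is easy to miss: a tag placed on the Feature line applies to every scenario under it. The sketch below is a toy model using made-up data, not Cucumber's tag-expression parser; it only supports "all of these tags, none of those". In Cucumber itself, a filter of this kind is typically written as a tag expression such as "@smoke and not @slow".

```javascript
// A scenario's effective tags are its own tags plus those inherited from the feature.
function effectiveTags(feature, scenario) {
  return new Set([...feature.tags, ...scenario.tags]);
}

// Returns the names of scenarios that carry every included tag and no excluded tag.
function selectScenarios(feature, { include = [], exclude = [] }) {
  return feature.scenarios
    .filter((scenario) => {
      const tags = effectiveTags(feature, scenario);
      return include.every((t) => tags.has(t)) && !exclude.some((t) => tags.has(t));
    })
    .map((scenario) => scenario.name);
}

// Illustrative data only.
const loginFeature = {
  tags: ['@authentication'],
  scenarios: [
    { name: 'Successful login', tags: ['@smoke', '@critical', '@happy-path'] },
    { name: 'Account lockout after failed attempts', tags: ['@edge-case', '@security'] },
    { name: 'Login across multiple browser sessions', tags: ['@regression', '@slow'] },
  ],
};

console.log(selectScenarios(loginFeature, { include: ['@smoke'] }));
console.log(selectScenarios(loginFeature, { include: ['@authentication'], exclude: ['@slow'] }));
```

The second query matches scenarios that never mention @authentication themselves, because they inherit it from the feature. By the same logic, putting @smoke on a Feature line pulls every scenario in that file into the smoke run, including the slow ones.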

Common Anti-Patterns to Avoid

1. The "And" Chain

Long chains of And steps hide what's actually being tested:

# Anti-pattern
Scenario: Checkout
  Given the user is on the checkout page
  And the cart has 3 items
  And the user has entered their shipping address
  And the user has entered their credit card number
  And the user has accepted the terms of service
  When the user clicks "Place Order"
  Then the order confirmation page is shown
  And an email is sent to the user
  And the inventory is updated
  And the payment is processed
  And the order appears in the admin panel

This scenario is testing too many things. Split it:

# Better
Scenario: Order confirmation email is sent after successful checkout
  Given a user has completed a valid checkout
  When the order is placed successfully
  Then a confirmation email is sent to the user's registered address

Scenario: Inventory decrements after order placement
  Given a product has 10 units in stock
  When a customer orders 3 units
  Then the product shows 7 units available
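
A quick way to spot an "And" chain is to count steps per phase, treating each And or But as belonging to the Given, When, or Then before it. This is a rough sketch with a hypothetical helper, stepsPerPhase; any threshold you apply to the counts is a team convention, not a Gherkin rule.

```javascript
// Counts steps per phase. And/But inherit the phase of the step before them.
function stepsPerPhase(scenarioText) {
  const counts = { Given: 0, When: 0, Then: 0 };
  let phase = null;
  for (const line of scenarioText.split('\n')) {
    const match = line.match(/^\s*(Given|When|Then|And|But)\s/);
    if (!match) continue; // scenario title, comment, or blank line
    if (match[1] !== 'And' && match[1] !== 'But') phase = match[1];
    if (phase) counts[phase] += 1;
  }
  return counts;
}

const checkout = `
Scenario: Checkout
  Given the user is on the checkout page
  And the cart has 3 items
  And the user has entered their shipping address
  And the user has entered their credit card number
  And the user has accepted the terms of service
  When the user clicks "Place Order"
  Then the order confirmation page is shown
  And an email is sent to the user
  And the inventory is updated
  And the payment is processed
  And the order appears in the admin panel
`;

console.log(stepsPerPhase(checkout));
```

For the anti-pattern scenario this reports five Given steps, one When, and five Then steps. Five separate outcomes under one Then is a strong hint that the scenario covers several behaviors.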

2. Leaking Implementation Details

# Anti-pattern — exposes database internals
Given the "users" table has a row with id=42, email="alice@example.com", is_active=1
When a GET request is made to "/api/users/42"
Then the response JSON contains "email": "alice@example.com"

This ties your test to the database schema and API implementation. A business stakeholder cannot read it. Instead:

# Better
Given Alice is a registered active user
When her profile is retrieved via the API
Then the response includes her email address
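
The schema details still have to live somewhere; the point is that they live in the step layer. One possible arrangement, sketched here with hypothetical names (personas, registerPersona, retrieveProfile) and an in-memory map standing in for the database, is a persona registry that the step definitions consult.

```javascript
// Hypothetical persona registry: the only place that knows the storage details.
// Field names mirror the anti-pattern above and are illustrative, not a real schema.
const personas = {
  Alice: { id: 42, email: 'alice@example.com', is_active: 1 },
};

// Fake in-memory "users table" so the sketch runs without a database.
const usersTable = new Map();

// Could back a step like: Given Alice is a registered active user
function registerPersona(name) {
  const persona = personas[name];
  if (!persona) throw new Error(`Unknown persona: ${name}`);
  usersTable.set(persona.id, { ...persona });
  return persona;
}

// Could back a step like: When her profile is retrieved via the API
function retrieveProfile(persona) {
  const row = usersTable.get(persona.id);
  return { email: row.email };
}

const alice = registerPersona('Alice');
console.log(retrieveProfile(alice).email === alice.email);
```

If the id column is renamed or the API route changes, only these helpers change; the scenario text stays exactly as written.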

3. Vague Then Steps

# Anti-pattern — what does "works correctly" mean?
Then the system works correctly
Then the page loads
Then everything is fine

Always be specific about the observable outcome:

Then the order status changes to "Processing"
Then the page title reads "Order Confirmed — #ORD-4821"
Then the user receives an email with subject "Your order has been received"

4. One Mega-Scenario

# Anti-pattern — one scenario doing everything
Scenario: Full user journey
  Given a new visitor arrives at the homepage
  When they click "Sign Up"
  And they complete registration
  And they verify their email
  And they log in
  And they add items to the cart
  And they complete checkout
  And they view their order history
  Then the entire flow completes successfully

End-to-end journeys have their place, but they should live in a dedicated e2e feature file tagged @e2e @slow, and most of your scenarios should test individual behaviors.

5. Test Data in Scenario Names

# Anti-pattern
Scenario: User alice@example.com with password P@ssw0rd1 can log in

Scenario names should describe behaviors, not contain data. The data belongs in the steps or Examples table.

Step Reuse and Writing Reusable Steps

Great Gherkin comes from steps that are reusable across scenarios without being so generic they lose meaning.

# Too specific — hard to reuse
Given the user "alice@example.com" with password "SecurePass123" is logged into the application on the main login page

# Too generic — ambiguous
Given the user exists

# Just right
Given Alice is a registered user with standard permissions

Aim for steps that sit at the right abstraction level: specific enough to be unambiguous, general enough to appear in multiple scenarios.

When step definitions start multiplying uncontrollably, audit your feature files. If you have 47 slightly different "Given a user is..." steps, consolidate them using parameterization:

Given a user with role "admin" is logged in
Given a user with role "customer" is logged in
Given a user with role "guest" is logged in

One step definition handles all three:

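const { Given } = require('@cucumber/cucumber');

// loginAs and createUserWithRole are project-specific helpers (not shown here).
Given('a user with role {string} is logged in', async (role) => {
  await loginAs(createUserWithRole(role));
});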
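
To show why one definition can serve all three steps, here is a toy matcher for the {string} placeholder, written in plain JavaScript so it runs without Cucumber installed. It is only an illustration of the idea: compile, defineStep, and run are hypothetical names, and the real expression engine supports more parameter types, single-quoted strings, and escaping.

```javascript
// Turns an expression containing {string} into a regular expression
// that captures the text between double quotes.
function compile(expression) {
  // Escape regex metacharacters first, which turns {string} into \{string\}.
  const escaped = expression.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const source = escaped.replace(/\\\{string\\\}/g, '"([^"]*)"');
  return new RegExp(`^${source}$`);
}

// Minimal registry of step patterns and their handlers.
const registry = [];
function defineStep(expression, handler) {
  registry.push({ pattern: compile(expression), handler });
}

// Finds the first matching pattern and calls its handler with the captured values.
function run(stepText) {
  for (const { pattern, handler } of registry) {
    const match = stepText.match(pattern);
    if (match) return handler(...match.slice(1));
  }
  throw new Error(`Undefined step: ${stepText}`);
}

const loggedIn = [];
defineStep('a user with role {string} is logged in', (role) => loggedIn.push(role));

for (const role of ['admin', 'customer', 'guest']) {
  run(`a user with role "${role}" is logged in`);
}
console.log(loggedIn);
```

Each of the three step texts matches the same pattern, and the quoted role arrives as an argument. Adding a fourth role requires a new line in a feature file and no new step definition.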

Putting It All Together

The best way to improve your Gherkin is to read it aloud. If a step sounds unnatural, rewrite it. If a scenario takes more than 30 seconds to explain to a colleague, split it. If a Background section is longer than the scenarios it precedes, something has gone wrong.

Great Gherkin is concise, business-readable, and implementation-agnostic. It focuses on what the system does, not how it does it. It makes test failures immediately understandable and makes new scenarios easy to add.

Invest time in your feature files. They are the documentation that stays up to date because the tests force them to — but only if you keep them clean.
