Writing Great Gherkin Scenarios: Best Practices and Common Mistakes
Gherkin is the language that makes Behavior-Driven Development work. When teams write it well, feature files become living documentation that business stakeholders can read and developers can execute. When teams write it poorly, Gherkin becomes a maintenance nightmare full of brittle, overlapping step definitions and scenarios that no one outside engineering can understand.
This guide covers the practical patterns and anti-patterns that separate great Gherkin from mediocre Gherkin — with real examples you can adapt for your own projects.
What Gherkin Actually Is
Gherkin is a domain-specific language designed to express software behavior in plain English (or many other natural languages). It uses a small set of keywords — Feature, Scenario, Given, When, Then, And, But, Background, Scenario Outline, Examples, and Rule — to structure test cases as conversations between a user and a system.
The three core keywords form the backbone of every scenario:
- Given — the precondition or starting state
- When — the action the user takes
- Then — the expected observable outcome
Feature: User Authentication
Scenario: Successful login with valid credentials
Given a registered user with email "alice@example.com" and password "SecurePass123"
When the user submits the login form
Then the user is redirected to the dashboard
And the welcome message displays "Welcome back, Alice"
This is clean Gherkin. Now let's dig into what makes it clean and how to keep it that way.
Declarative vs Imperative Scenarios
The single biggest quality distinction in Gherkin is the difference between declarative and imperative scenarios.
Imperative scenarios describe how the user interacts with the system — the clicks, the form fields, the exact UI mechanics:
# Imperative — avoid this style
Scenario: Login
Given the user navigates to "https://app.example.com/login"
When the user clicks on the "Email" field
And the user types "alice@example.com"
And the user clicks on the "Password" field
And the user types "SecurePass123"
And the user clicks the "Sign In" button
Then the URL should be "https://app.example.com/dashboard"
Declarative scenarios describe what the user wants to accomplish, the intent rather than the mechanics:
# Declarative — prefer this style
Scenario: Login with valid credentials
Given Alice is a registered user
When she logs in with her credentials
Then she sees her personal dashboard
Declarative scenarios are shorter, more readable, and far more resilient to UI changes. When the login page gets redesigned, you change one step definition, not thirty scenarios. The "how" lives in the step implementation, not in the feature file.
A good test: can a product manager or business analyst read your scenario and immediately understand what it's testing? If yes, you're writing declaratively. If they'd need to consult a UI designer to interpret it, you've gone too imperative.
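To make "the how lives in the step implementation" concrete, here is a minimal sketch of what could sit behind a step like `When she logs in with her credentials`. The fake page object, the `users` lookup, and `logInWithCredentials` are hypothetical names invented for illustration; a real suite would call whatever browser driver it already uses.

```javascript
// Hypothetical stand-in for a browser driver: records actions instead of performing them.
function createFakePage() {
  const actions = [];
  return {
    actions,
    goto: async (url) => { actions.push(`goto ${url}`); },
    fill: async (field, value) => { actions.push(`fill ${field}=${value}`); },
    click: async (label) => { actions.push(`click ${label}`); },
  };
}

// Hypothetical test users, keyed by the persona name used in the feature file.
const users = {
  Alice: { email: 'alice@example.com', password: 'SecurePass123' },
};

// Body of the declarative login step. Every UI detail from the imperative
// scenario lives here, so a login redesign changes only this function.
async function logInWithCredentials(page, personaName) {
  const user = users[personaName];
  if (!user) throw new Error(`Unknown persona: ${personaName}`);
  await page.goto('https://app.example.com/login');
  await page.fill('Email', user.email);
  await page.fill('Password', user.password);
  await page.click('Sign In');
}

// Usage: the feature file says "she logs in"; the mechanics stay out of sight.
const page = createFakePage();
logInWithCredentials(page, 'Alice').then(() => console.log(page.actions));
```

The feature file never mentions fields or buttons, so thirty scenarios can share this one function.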
Writing Effective Background Steps
The Background keyword lets you define steps that run before every scenario in a feature file. Use it for shared preconditions — but use it carefully.
Feature: Shopping Cart
Background:
Given the user is logged in as "customer"
And the product catalog contains at least 10 items
Scenario: Adding an item to an empty cart
When the user adds "Wireless Headphones" to the cart
Then the cart displays 1 item
And the cart total shows "$89.99"
Scenario: Adding duplicate items increases quantity
Given the cart already contains 1 "Wireless Headphones"
When the user adds "Wireless Headphones" to the cart again
Then the cart displays 2 items
And the cart total shows "$179.98"
Background best practices:
- Keep it short — two or three steps maximum
- Only include context that genuinely applies to all scenarios in the file
- Do not use Background to hide important scenario-specific context
- If you find yourself writing "Given the user is NOT logged in" as a one-off exception, that scenario probably belongs in a different feature file
The Background anti-pattern is stuffing so much into Background that readers must scroll up repeatedly to understand each scenario:
# Anti-pattern: Background that's longer than the scenarios
Background:
Given the system is initialized
And the database is seeded with test data
And the email service is mocked
And the payment gateway is in sandbox mode
And three user accounts exist: "admin", "customer", "guest"
And the feature flag "new-checkout" is enabled
And the user is logged in as "customer"
And the shopping cart is empty
This background creates hidden dependencies and makes scenarios hard to read in isolation. Move the specific setup into the scenarios that need it.
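If you want a guard against Background creep, a small check can flag it during review. The sketch below assumes English keywords and one step per line. It is a line scanner, not a real Gherkin parser, and the limit of three steps simply mirrors the guideline above. The function names are made up for this example.

```javascript
// Lines that start a step, and lines that start a new section.
const STEP_KEYWORD = /^(Given|When|Then|And|But|\*)\s/;
const SECTION_KEYWORD = /^(Feature|Rule|Background|Scenario Outline|Scenario Template|Scenario|Example|Examples):/;

// Return the number of steps in each Background section of a feature file.
function countBackgroundSteps(featureText) {
  const counts = [];
  let inBackground = false;
  for (const rawLine of featureText.split('\n')) {
    const line = rawLine.trim();
    if (SECTION_KEYWORD.test(line)) {
      // Any section header ends the previous Background.
      inBackground = line.startsWith('Background:');
      if (inBackground) counts.push(0);
    } else if (inBackground && STEP_KEYWORD.test(line)) {
      counts[counts.length - 1] += 1;
    }
  }
  return counts;
}

// Produce one warning per Background that exceeds the step limit.
function backgroundWarnings(featureText, maxSteps = 3) {
  return countBackgroundSteps(featureText)
    .filter((count) => count > maxSteps)
    .map((count) => `Background has ${count} steps (limit ${maxSteps})`);
}

const bloated = [
  'Background:',
  'Given the system is initialized',
  'And the database is seeded with test data',
  'And the email service is mocked',
  'And the payment gateway is in sandbox mode',
  'Scenario: Adding an item to an empty cart',
  'When the user adds "Wireless Headphones" to the cart',
].join('\n');

console.log(backgroundWarnings(bloated)); // [ 'Background has 4 steps (limit 3)' ]
```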
Scenario Outlines: Testing Multiple Data Combinations
When you need to run the same behavior with different data, Scenario Outline (also called Scenario Template in some implementations) eliminates duplication:
Feature: Password Validation
Scenario Outline: Reject invalid passwords during registration
Given a user is completing registration
When they enter the password "<password>"
Then they see the error "<error_message>"
Examples:
| password | error_message |
| abc | Password must be at least 8 characters |
| password | Password must contain a number |
| 12345678 | Password must contain a letter |
| Password1 | Password must contain a special character|
| P@ssword1 | |
Scenario Outline: Accept valid passwords
Given a user is completing registration
When they enter the password "<password>"
Then registration proceeds without password errors
Examples:
| password |
| P@ssword1! |
| Secur3#Pass |
| MyStr0ng!pwd |
Notice the empty error_message in the last row of the invalid passwords table: that is a mistake. P@ssword1 satisfies every rule the other rows exercise, so the row either belongs in the valid passwords outline or should be removed. Data tables should have no ambiguous empty cells.
Tips for Examples tables:
- Use meaningful column headers that read like plain English
- Keep the number of columns small — if you need more than four columns, the outline is probably too complex
- Group related examples in named blocks using @tags on the Examples section when your tool supports it
- Test boundary values: minimum, maximum, just-below-minimum, just-above-maximum
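To see what an outline does with its table, and how an empty cell can be caught early, here is a rough sketch of the expansion. `parseRow` and `expandOutline` are hypothetical helpers, not part of any Cucumber API, and the parser ignores escaped pipes. Rejecting empty cells is a team convention enforced by this sketch. Gherkin itself accepts them.

```javascript
// Split one table row such as "| abc | Too short |" into trimmed cells.
function parseRow(row) {
  return row.trim().replace(/^\|/, '').replace(/\|$/, '').split('|').map((cell) => cell.trim());
}

// Expand outline steps once per data row, replacing <header> placeholders.
// Throws on empty or missing cells so ambiguous rows surface before a run.
function expandOutline(steps, tableRows) {
  const [headers, ...rows] = tableRows.map(parseRow);
  return rows.map((cells, index) => {
    if (cells.length !== headers.length) {
      throw new Error(`Row ${index + 1}: expected ${headers.length} cells, got ${cells.length}`);
    }
    cells.forEach((cell, col) => {
      if (cell === '') throw new Error(`Row ${index + 1}: empty value for "${headers[col]}"`);
    });
    return steps.map((step) =>
      headers.reduce((text, header, col) => text.split(`<${header}>`).join(cells[col]), step));
  });
}

const steps = [
  'When they enter the password "<password>"',
  'Then they see the error "<error_message>"',
];

console.log(expandOutline(steps, [
  '| password | error_message |',
  '| abc | Password must be at least 8 characters |',
]));

// The ambiguous row from the table above is rejected instead of silently passing:
try {
  expandOutline(steps, ['| password | error_message |', '| P@ssword1 | |']);
} catch (error) {
  console.log(error.message); // Row 1: empty value for "error_message"
}
```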
Tags: Organizing Your Test Suite
Tags categorize scenarios so you can run subsets of your suite selectively:
@authentication @smoke @critical
Feature: User Login
@happy-path
Scenario: Successful login
Given a registered user
When they log in with valid credentials
Then they access the dashboard
@edge-case @security
Scenario: Account lockout after failed attempts
Given a registered user
When they enter the wrong password 5 times consecutively
Then the account is locked for 30 minutes
And they receive a security alert email
@regression @slow
Scenario: Login across multiple browser sessions
Given a user is logged in on Chrome
When they also log in on Firefox
Then both sessions are active simultaneously
Tag conventions that work in practice:
| Tag | Purpose |
|---|---|
| @smoke | Fast sanity check; runs in under 5 minutes |
| @regression | Full regression suite |
| @critical | Business-critical paths; failures block release |
| @wip | Work in progress; excluded from CI |
| @slow | Long-running tests; runs nightly, not on every commit |
| @security | Security-specific scenarios |
| @flaky | Known intermittent failures under investigation |
Avoid over-tagging. If every scenario has eight tags, the tags lose their organizational value.
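Selecting a subset by tag boils down to set membership. The sketch below models the User Login feature as a plain object (a made-up shape, not a Cucumber data structure) and picks scenarios that carry every included tag and none of the excluded ones. That is roughly what a tag expression such as `@smoke and not @slow` does in Cucumber.

```javascript
// Select scenarios that have every tag in `include` and no tag in `exclude`.
// Feature-level tags are inherited by each scenario, as in Gherkin.
function selectScenarios(feature, { include = [], exclude = [] } = {}) {
  return feature.scenarios.filter((scenario) => {
    const tags = new Set([...feature.tags, ...scenario.tags]);
    return include.every((tag) => tags.has(tag)) && !exclude.some((tag) => tags.has(tag));
  });
}

const loginFeature = {
  tags: ['@authentication', '@smoke', '@critical'],
  scenarios: [
    { name: 'Successful login', tags: ['@happy-path'] },
    { name: 'Account lockout after failed attempts', tags: ['@edge-case', '@security'] },
    { name: 'Login across multiple browser sessions', tags: ['@regression', '@slow'] },
  ],
};

const fastSmoke = selectScenarios(loginFeature, { include: ['@smoke'], exclude: ['@slow'] });
console.log(fastSmoke.map((scenario) => scenario.name));
// [ 'Successful login', 'Account lockout after failed attempts' ]
```

Because the feature itself is tagged @smoke, even the @slow scenario counts as a smoke test until it is excluded. Feature-level tags are convenient, but they apply to everything beneath them.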
Common Anti-Patterns to Avoid
1. The "And" Chain
Long chains of And steps hide what's actually being tested:
# Anti-pattern
Scenario: Checkout
Given the user is on the checkout page
And the cart has 3 items
And the user has entered their shipping address
And the user has entered their credit card number
And the user has accepted the terms of service
When the user clicks "Place Order"
Then the order confirmation page is shown
And an email is sent to the user
And the inventory is updated
And the payment is processed
And the order appears in the admin panel
This scenario is testing too many things. Split it:
# Better
Scenario: Order confirmation email is sent after successful checkout
Given a user has completed a valid checkout
When the order is placed successfully
Then a confirmation email is sent to the user's registered address
Scenario: Inventory decrements after order placement
Given a product has 10 units in stock
When a customer orders 3 units
Then the product shows 7 units available
2. Leaking Implementation Details
# Anti-pattern — exposes database internals
Given the "users" table has a row with id=42, email="alice@example.com", is_active=1
When a GET request is made to "/api/users/42"
Then the response JSON contains "email": "alice@example.com"
This ties your test to the database schema and API implementation. A business stakeholder cannot read it. Instead:
# Better
Given Alice is a registered active user
When her profile is retrieved via the API
Then the response includes her email address
3. Vague Then Steps
# Anti-pattern — what does "works correctly" mean?
Then the system works correctly
Then the page loads
Then everything is fine
Always be specific about the observable outcome:
Then the order status changes to "Processing"
Then the page title reads "Order Confirmed — #ORD-4821"
Then the user receives an email with subject "Your order has been received"
4. One Mega-Scenario
# Anti-pattern — one scenario doing everything
Scenario: Full user journey
Given a new visitor arrives at the homepage
When they click "Sign Up"
And they complete registration
And they verify their email
And they log in
And they add items to the cart
And they complete checkout
And they view their order history
Then the entire flow completes successfully
End-to-end journeys have their place, but they should live in a dedicated e2e feature file tagged @e2e @slow, and most of your scenarios should test individual behaviors.
5. Test Data in Scenario Names
# Anti-pattern
Scenario: User alice@example.com with password P@ssw0rd1 can log in
Scenario names should describe behaviors, not contain data. The data belongs in the steps or Examples table.
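Several of these anti-patterns are mechanical enough to catch with a simple check before a human reviews the file. The sketch below is one possible heuristic, not an established lint rule set. The phrase list, the email pattern, and the seven-step limit are all assumptions chosen for illustration, and `lintScenario` is a hypothetical name.

```javascript
// Phrases that suggest a Then step with no observable outcome (illustrative list).
const VAGUE_OUTCOMES = [/works correctly/i, /everything is fine/i, /completes successfully/i];
// An email address in a scenario name suggests test data has leaked into it.
const DATA_IN_NAME = /\S+@\S+\.\S+/;

// Return a list of warnings for one scenario given as { name, steps }.
function lintScenario({ name, steps }, maxSteps = 7) {
  const warnings = [];
  if (DATA_IN_NAME.test(name)) warnings.push('Scenario name contains test data');
  if (steps.length > maxSteps) warnings.push(`Scenario has ${steps.length} steps (limit ${maxSteps})`);

  let keyword = null; // And/But inherit the keyword of the step before them
  for (const step of steps) {
    const [first] = step.split(' ');
    if (first !== 'And' && first !== 'But') keyword = first;
    if (keyword === 'Then' && VAGUE_OUTCOMES.some((pattern) => pattern.test(step))) {
      warnings.push(`Vague outcome: "${step}"`);
    }
  }
  return warnings;
}

console.log(lintScenario({
  name: 'User alice@example.com with password P@ssw0rd1 can log in',
  steps: ['Given a registered user', 'When they log in', 'Then the system works correctly'],
}));
// [ 'Scenario name contains test data', 'Vague outcome: "Then the system works correctly"' ]
```

A check like this only finds symptoms. Whether a long scenario is a legitimate end-to-end journey or a mega-scenario still takes a reader's judgment.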
Step Reuse and Writing Reusable Steps
Great Gherkin comes from steps that are reusable across scenarios without being so generic they lose meaning.
# Too specific — hard to reuse
Given the user "alice@example.com" with password "SecurePass123" is logged into the application on the main login page
# Too generic — ambiguous
Given the user exists
# Just right
Given Alice is a registered user with standard permissions
Aim for steps that sit at the right abstraction level: specific enough to be unambiguous, general enough to appear in multiple scenarios.
When step definitions start multiplying uncontrollably, audit your feature files. If you have 47 slightly different "Given a user is..." steps, consolidate them using parameterization:
Given a user with role "admin" is logged in
Given a user with role "customer" is logged in
Given a user with role "guest" is logged in
One step definition handles all three:
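// Assumes Cucumber.js; adjust the import for your runner.
const { Given } = require('@cucumber/cucumber');
// {string} captures the quoted role; loginAs and createUserWithRole are project-specific helpers.
Given('a user with role {string} is logged in', async (role) => {
  await loginAs(createUserWithRole(role));
});
Putting It All Together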
The best way to improve your Gherkin is to read it aloud. If a step sounds unnatural, rewrite it. If a scenario takes more than 30 seconds to explain to a colleague, split it. If a Background section is longer than the scenarios it precedes, something has gone wrong.
Great Gherkin is concise, business-readable, and implementation-agnostic. It focuses on what the system does, not how it does it. It makes test failures immediately understandable and makes new scenarios easy to add.
Invest time in your feature files. They are the documentation that stays up to date because the tests force them to — but only if you keep them clean.