Testing in Haskell with HSpec: BDD-Style Unit Tests

Testing in Haskell with HSpec: BDD-Style Unit Tests

Testing in a purely functional language like Haskell has a reputation for being either trivially easy (pure functions are referentially transparent, so you just call them) or surprisingly complex (dealing with IO, monads, and laziness). The truth is somewhere in the middle. HSpec, inspired by Ruby's RSpec, gives you a clean BDD-style test framework that makes Haskell testing approachable, expressive, and powerful — especially when combined with QuickCheck for property-based testing.

This guide walks through everything you need to write solid Haskell test suites with HSpec, from installation through CI integration.

Why HSpec?

HSpec takes the describe/it block pattern from RSpec and ports it faithfully to Haskell. The result reads almost like natural language:

describe "parseEmail" $ do
  it "accepts valid email addresses" $
    parseEmail "user@example.com" `shouldBe` Right "user@example.com"
  it "rejects addresses without @" $
    parseEmail "notanemail" `shouldBe` Left InvalidFormat

That structure — describe for grouping, it for individual expectations — makes test suites self-documenting. You can read the spec tree in the terminal output and understand exactly what behavior is being verified.

Beyond readability, HSpec integrates cleanly with QuickCheck (for property tests), provides test discovery via hspec-discover, supports parallel execution, and works with both Cabal and Stack.

Installation

With Cabal

Add hspec to your .cabal file under the test suite:

test-suite myproject-test
  type:                exitcode-stdio-1.0
  main-is:             Spec.hs
  other-modules:       ParserSpec
                     , UserSpec
  build-depends:       base >= 4.14
                     , myproject
                     , hspec >= 2.10
                     , hspec-discover >= 2.10
  hs-source-dirs:      test
  default-language:    Haskell2010
  ghc-options:         -threaded -rtsopts -with-rtsopts=-N

With hspec-discover, your Spec.hs entry point is minimal:

-- test/Spec.hs
{-# OPTIONS_GHC -F -pgmF hspec-discover #-}

That single pragma tells GHC to automatically collect all *Spec.hs files and wire them together. No manual imports required.

With Stack

Add to package.yaml:

tests:
  myproject-test:
    main: Spec.hs
    source-dirs: test
    ghc-options:
      - -threaded
      - -rtsopts
      - -with-rtsopts=-N
    dependencies:
      - myproject
      - hspec
      - hspec-discover
      - QuickCheck

Writing Your First Spec

Create test/ParserSpec.hs:

module ParserSpec (spec) where

import Test.Hspec
import Data.Parser (parseAge, parseName, parseUser)

spec :: Spec
spec = do
  describe "parseAge" $ do
    it "parses a valid age" $
      parseAge "25" `shouldBe` Right 25

    it "rejects negative ages" $
      parseAge "-1" `shouldBe` Left "Age must be positive"

    it "rejects non-numeric input" $
      parseAge "abc" `shouldBe` Left "Invalid number"

    it "rejects ages over 150" $
      parseAge "200" `shouldBe` Left "Age too large"

  describe "parseName" $ do
    context "with valid input" $ do
      it "accepts alphabetic names" $
        parseName "Alice" `shouldBe` Right "Alice"

      it "trims leading and trailing whitespace" $
        parseName "  Bob  " `shouldBe` Right "Bob"

    context "with invalid input" $ do
      it "rejects empty strings" $
        parseName "" `shouldBe` Left "Name cannot be empty"

      it "rejects names with numbers" $
        parseName "Alice99" `shouldBe` Left "Name must contain only letters"

Notice the use of context — it's an alias for describe that communicates "under these conditions." This nesting creates a readable specification tree.

Core Matchers

HSpec's expectation functions cover most common assertions:

-- Equality
result `shouldBe` expected
result `shouldNotBe` unexpected

-- Predicates
result `shouldSatisfy` even
result `shouldNotSatisfy` null

-- Lists
[1,2,3] `shouldContain` [2,3]
[1,2,3] `shouldNotContain` [5]
result `shouldMatchList` [3,1,2]  -- order-insensitive

-- Exceptions (IO context)
action `shouldThrow` anyException
action `shouldThrow` anyIOException
action `shouldThrow` (== MyCustomException)

Testing exceptions requires wrapping in IO:

describe "Database.connect" $ do
  it "throws ConnectionRefused when server is down" $ do
    connect "localhost:99999" `shouldThrow` (== ConnectionRefused)

Testing IO Actions

Haskell's type system separates pure and effectful code, but HSpec handles both naturally. Any it block can return an IO ():

import Control.Monad (forM_)
import System.IO.Temp (withSystemTempDirectory)

describe "FileStore" $ do
  it "writes and reads back a file" $ do
    withSystemTempDirectory "hspec-test" $ \dir -> do
      let path = dir <> "/test.txt"
      writeFile path "hello"
      content <- readFile path
      content `shouldBe` "hello"

  it "lists all stored files" $ do
    withSystemTempDirectory "hspec-test" $ \dir -> do
      let files = ["a.txt", "b.txt", "c.txt"]
      forM_ files $ \f -> writeFile (dir <> "/" <> f) ""
      stored <- listStore dir
      stored `shouldMatchList` files

Setup and Teardown

HSpec provides before, after, around, and beforeAll for lifecycle management:

import Database.Connection

spec :: Spec
spec = do
  -- Run once for the entire suite
  beforeAll createTestDatabase $ do
    -- Run before each test
    before (clearTables) $ do
      describe "UserRepository" $ do
        it "inserts a user" $ \conn -> do
          insertUser conn testUser
          found <- findUser conn (userId testUser)
          found `shouldBe` Just testUser

        it "deletes a user" $ \conn -> do
          insertUser conn testUser
          deleteUser conn (userId testUser)
          found <- findUser conn (userId testUser)
          found `shouldBe` Nothing

The around combinator is the most flexible — it wraps each test with setup and teardown in a bracket:

withDatabase :: ActionWith Connection -> IO ()
withDatabase action = do
  conn <- openTestConnection
  action conn `finally` closeConnection conn

spec :: Spec
spec = around withDatabase $ do
  describe "queries" $ do
    it "handles empty results" $ \conn -> do
      results <- runQuery conn "SELECT * FROM empty_table"
      results `shouldBe` []

Shared Examples

HSpec supports reusable example groups, which is excellent for testing typeclass instances or common behaviors across multiple types:

-- Define shared behavior
sharedSerializationSpec :: (Show a, Eq a, Serializable a) => a -> Spec
sharedSerializationSpec value = do
  it "roundtrips through serialize/deserialize" $
    deserialize (serialize value) `shouldBe` Right value

  it "produces non-empty output" $
    serialize value `shouldNotBe` ""

-- Apply to multiple types
spec :: Spec
spec = do
  describe "User serialization" $
    sharedSerializationSpec testUser

  describe "Product serialization" $
    sharedSerializationSpec testProduct

  describe "Order serialization" $
    sharedSerializationSpec testOrder

Combining HSpec with QuickCheck

Property-based testing shines in Haskell. HSpec integrates QuickCheck via the property function:

import Test.Hspec
import Test.Hspec.QuickCheck
import Test.QuickCheck

describe "sort" $ do
  -- Traditional example-based test
  it "sorts a known list" $
    sort [3, 1, 2] `shouldBe` [1, 2, 3]

  -- Property-based test — QuickCheck generates hundreds of inputs
  prop "produces a sorted result" $ \(xs :: [Int]) ->
    let sorted = sort xs
    in isSorted sorted

  prop "preserves all elements" $ \(xs :: [Int]) ->
    sort xs `shouldMatchList` xs

  prop "is idempotent" $ \(xs :: [Int]) ->
    sort (sort xs) == sort xs

isSorted :: Ord a => [a] -> Bool
isSorted []       = True
isSorted [_]      = True
isSorted (x:y:zs) = x <= y && isSorted (y:zs)

The prop function (from Test.Hspec.QuickCheck) runs the property with 100 randomly generated inputs by default. If any fail, QuickCheck shrinks the input to the smallest failing case.

Parallel Spec Execution

For large test suites, parallelism cuts wall-clock time. HSpec supports it with a pragma:

{-# OPTIONS_GHC -F -pgmF hspec-discover -optF --module-name=Spec #-}

And in your tests, mark parallel-safe specs:

import Test.Hspec.Core.Runner (runSpec)
import Test.Hspec.Core.Spec   (parallel)

spec :: Spec
spec = parallel $ do
  describe "pure computations" $ do
    -- These run in parallel safely
    it "test 1" $ ...
    it "test 2" $ ...

Pass --jobs N when running to control parallelism:

cabal test --test-options=<span class="hljs-string">"--jobs 4"

Real Example: Testing a Parser

Let's build a complete spec for an email parser module:

-- src/Data/Email.hs
module Data.Email where

import Data.Char (isAlphaNum)
import Data.List (isPrefixOf)

data Email = Email
  { localPart :: String
  , domain    :: String
  } deriving (Eq, Show)

parseEmail :: String -> Either String Email
parseEmail s =
  case break (== '@') s of
    (_, [])      -> Left "Missing @ symbol"
    ("", _)      -> Left "Empty local part"
    (local, '@':dom)
      | null dom        -> Left "Empty domain"
      | '.' `notElem` dom -> Left "Domain missing dot"
      | not (validLocal local) -> Left "Invalid local part"
      | otherwise       -> Right (Email local dom)
    _            -> Left "Parse error"

validLocal :: String -> Bool
validLocal = all (\c -> isAlphaNum c || c `elem` "._%+-")
-- test/Data/EmailSpec.hs
module Data.EmailSpec (spec) where

import Test.Hspec
import Test.Hspec.QuickCheck
import Test.QuickCheck
import Data.Email

spec :: Spec
spec = do
  describe "parseEmail" $ do

    context "valid email addresses" $ do
      it "parses a standard address" $
        parseEmail "user@example.com"
          `shouldBe` Right (Email "user" "example.com")

      it "parses subdomains" $
        parseEmail "admin@mail.example.co.uk"
          `shouldBe` Right (Email "admin" "mail.example.co.uk")

      it "handles plus-addressing" $
        parseEmail "user+tag@example.com"
          `shouldBe` Right (Email "user+tag" "example.com")

      it "handles dots in local part" $
        parseEmail "first.last@example.com"
          `shouldBe` Right (Email "first.last" "example.com")

    context "invalid email addresses" $ do
      it "rejects missing @" $
        parseEmail "notanemail"
          `shouldBe` Left "Missing @ symbol"

      it "rejects empty local part" $
        parseEmail "@example.com"
          `shouldBe` Left "Empty local part"

      it "rejects empty domain" $
        parseEmail "user@"
          `shouldBe` Left "Empty domain"

      it "rejects domain without dot" $
        parseEmail "user@localhost"
          `shouldBe` Left "Domain missing dot"

    context "properties" $ do
      prop "valid emails always parse successfully" $
        \local dom ->
          let email = validLocalChars local <> "@" <> validDomainStr dom
          in isRight (parseEmail email)

      prop "parsed local part matches input" $
        \s -> case parseEmail s of
          Right (Email loc _) -> '@' `notElem` loc
          Left _              -> True

isRight :: Either a b -> Bool
isRight (Right _) = True
isRight _         = False

-- Generators for valid email parts
validLocalChars :: String -> String
validLocalChars = filter (\c -> isAlphaNum c || c `elem` "._%+-") . take 20

validDomainStr :: String -> String
validDomainStr s =
  let clean = filter isAlphaNum (take 10 s)
  in if null clean then "example.com" else clean <> ".com"

Run it:

cabal test
<span class="hljs-comment"># or
stack <span class="hljs-built_in">test

Output:

Data.Email
  parseEmail
    valid email addresses
      parses a standard address [✓]
      parses subdomains [✓]
      handles plus-addressing [✓]
      handles dots in local part [✓]
    invalid email addresses
      rejects missing @ [✓]
      rejects empty local part [✓]
      rejects empty domain [✓]
      rejects domain without dot [✓]
    properties
      valid emails always parse successfully [✓] (100 tests)
      parsed local part matches input [✓] (100 tests)

Finished in 0.0234 seconds
10 examples, 0 failures

CI with GitHub Actions

# .github/workflows/test.yml
name: Haskell Tests

on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Haskell
        uses: haskell-actions/setup@v2
        with:
          ghc-version: '9.6'
          cabal-version: 'latest'

      - name: Cache Cabal packages
        uses: actions/cache@v3
        with:
          path: |
            ~/.cabal/packages
            ~/.cabal/store
            dist-newstyle
          key: ${{ runner.os }}-cabal-${{ hashFiles('**/*.cabal') }}
          restore-keys: |
            ${{ runner.os }}-cabal-

      - name: Install dependencies
        run: cabal build --only-dependencies --enable-tests

      - name: Build
        run: cabal build --enable-tests

      - name: Run tests
        run: cabal test --test-show-details=always

For Stack-based projects, replace the last three steps:

      - name: Set up Haskell (Stack)
        uses: haskell-actions/setup@v2
        with:
          enable-stack: true
          stack-version: 'latest'

      - name: Cache Stack packages
        uses: actions/cache@v3
        with:
          path: ~/.stack
          key: ${{ runner.os }}-stack-${{ hashFiles('**/stack.yaml.lock') }}

      - name: Run tests
        run: stack test --no-terminal

HSpec Discovery Tips

With hspec-discover, any file matching *Spec.hs under your test directory is automatically included. Organize by mirroring your source tree:

src/
  Data/
    Email.hs
    User.hs
  Api/
    Handler.hs

test/
  Data/
    EmailSpec.hs
    UserSpec.hs
  Api/
    HandlerSpec.hs
  Spec.hs       -- just the discover pragma

Discovery also respects module hierarchy — so Data.EmailSpec exports spec :: Spec and hspec-discover wires it in under the Data.Email heading automatically.

Common Pitfalls

Lazy evaluation and exceptions. Haskell's laziness means an exception might not be thrown until a value is forced. If shouldThrow tests are flaky, add a evaluate call to force evaluation:

it "throws on division by zero" $ do
  evaluate (div 1 0) `shouldThrow` anyArithException

Shared mutable state. HSpec runs beforeAll once per suite, but if your test database connection holds state, use IORef or MVar carefully. Prefer before (per-test) over beforeAll (per-suite) unless you're certain tests don't interfere.

QuickCheck default size. By default QuickCheck generates small inputs. For larger structures, adjust the maxSize parameter:

modifyMaxSize (const 200) $ do
  prop "handles large lists" $ \(xs :: [Int]) -> ...

What to Test in Haskell

Given Haskell's strong type system, many bugs are caught at compile time. Focus HSpec on:

  • Business logic in pure functions — parsers, validators, transformers, calculations
  • Edge cases the type system allows — empty lists, zero values, boundary integers
  • IO and effectful code — database queries, file operations, HTTP calls
  • Error pathsLeft values, exceptions, failure modes
  • Roundtrip properties — serialize/deserialize, encode/decode, parse/render

The type system handles structural correctness; your tests handle semantic correctness.

HelpMeTest adds 24/7 monitoring and AI-powered test generation for your Haskell services — start free at helpmetest.com

Read more