Test Automation

Testing Haskell IO and Effects with MTL and Polysemy

HelpMeTest

20 May 2026 — 4 min read

Testing effectful Haskell code is where the language's type system earns its reputation. By abstracting over effects — whether through MTL typeclasses or algebraic effect systems like Polysemy — you can run production code with a pure test interpreter. No mocking frameworks, no monkey-patching, just different interpreters for different environments.

The Core Strategy

In Haskell, you make code testable by parameterizing over capabilities rather than hard-coding IO. A function that reads from a database shouldn't take a Connection — it should work over any monad that provides the MonadDatabase capability. In tests, that monad is a pure State-based interpreter. In production, it's the real database.

MTL-Style: Typeclasses for Effects

Define capabilities as typeclasses:

{-# LANGUAGE FlexibleContexts #-}

class Monad m => MonadUserRepo m where
    findUser  :: UserId -> m (Maybe User)
    saveUser  :: User -> m ()
    listUsers :: m [User]

class Monad m => MonadEmailService m where
    sendEmail :: EmailAddress -> Subject -> Body -> m ()

class Monad m => MonadLogger m where
    logInfo :: Text -> m ()
    logError :: Text -> m ()

Business logic uses these constraints:

registerUser
    :: (MonadUserRepo m, MonadEmailService m, MonadLogger m)
    => RegistrationRequest
    -> m (Either RegistrationError User)
registerUser req = do
    logInfo $ "Registering: " <> email req
    existing <- findUser (userId req)
    case existing of
        Just _ -> do
            logError "Duplicate email"
            return $ Left DuplicateEmail
        Nothing -> do
            let user = makeUser req
            saveUser user
            sendEmail (email req) "Welcome!" (welcomeBody user)
            return $ Right user

This function is entirely pure — it has no IO in sight, only constraints.

Production Interpreter

newtype AppM a = AppM { runAppM :: ReaderT AppEnv IO a }
    deriving (Functor, Applicative, Monad, MonadIO, MonadReader AppEnv)

instance MonadUserRepo AppM where
    findUser uid = do
        pool <- asks dbPool
        liftIO $ withConnection pool $ \conn ->
            queryOne conn "SELECT * FROM users WHERE id = ?" [uid]

    saveUser user = do
        pool <- asks dbPool
        liftIO $ withConnection pool $ \conn ->
            execute conn "INSERT INTO users ..." (userFields user)

    listUsers = do
        pool <- asks dbPool
        liftIO $ withConnection pool $ \conn ->
            queryAll conn "SELECT * FROM users"

instance MonadEmailService AppM where
    sendEmail addr subj body = do
        client <- asks smtpClient
        liftIO $ SMTP.send client addr subj body

instance MonadLogger AppM where
    logInfo  msg = liftIO $ putStrLn $ "[INFO]  " <> T.unpack msg
    logError msg = liftIO $ putStrLn $ "[ERROR] " <> T.unpack msg

Test Interpreter

In tests, provide a pure interpreter using StateT:

data TestState = TestState
    { storeUsers  :: Map UserId User
    , sentEmails  :: [(EmailAddress, Subject, Body)]
    , logMessages :: [Text]
    } deriving (Show)

initialTestState :: TestState
initialTestState = TestState Map.empty [] []

newtype TestM a = TestM { runTestM :: State TestState a }
    deriving (Functor, Applicative, Monad, MonadState TestState)

instance MonadUserRepo TestM where
    findUser uid = gets (Map.lookup uid . storeUsers)
    saveUser user = modify $ \s ->
        s { storeUsers = Map.insert (userId user) user (storeUsers s) }
    listUsers = gets (Map.elems . storeUsers)

instance MonadEmailService TestM where
    sendEmail addr subj body = modify $ \s ->
        s { sentEmails = (addr, subj, body) : sentEmails s }

instance MonadLogger TestM where
    logInfo  msg = modify $ \s -> s { logMessages = ("INFO: " <> msg) : logMessages s }
    logError msg = modify $ \s -> s { logMessages = ("ERROR: " <> msg) : logMessages s }

-- Helper to run test code
runTest :: TestM a -> (a, TestState)
runTest action = runState (runTestM action) initialTestState

Testing with the Pure Interpreter

import Test.Hspec

spec :: Spec
spec = describe "registerUser" $ do

    it "creates user and sends welcome email on success" $ do
        let req = RegistrationRequest "alice" "alice@example.com"
        let (result, state) = runTest (registerUser req)
        result `shouldBe` Right (User "alice" "alice@example.com")
        Map.size (storeUsers state) `shouldBe` 1
        length (sentEmails state) `shouldBe` 1
        fst3 (head (sentEmails state)) `shouldBe` "alice@example.com"

    it "returns DuplicateEmail when user exists" $ do
        let req = RegistrationRequest "alice" "alice@example.com"
        let existingUser = User "alice" "alice@example.com"
        let initialState = initialTestState
              { storeUsers = Map.singleton "alice@example.com" existingUser }
        let (result, state) = runState (runTestM (registerUser req)) initialState
        result `shouldBe` Left DuplicateEmail
        length (sentEmails state) `shouldBe` 0  -- no email sent

    it "logs error when duplicate detected" $ do
        let req = RegistrationRequest "alice" "alice@example.com"
        let initialState = initialTestState
              { storeUsers = Map.singleton "alice@example.com" (User "alice" "alice@example.com") }
        let (_, state) = runState (runTestM (registerUser req)) initialState
        any ("ERROR" `T.isPrefixOf`) (logMessages state) `shouldBe` True

No mocking, no stubs, no framework. The test is a pure function.

Polysemy: Algebraic Effects

Polysemy provides a different take on effects using a row-polymorphic effect system:

{-# LANGUAGE DataKinds, GADTs, TemplateHaskell #-}

import Polysemy
import Polysemy.State
import Polysemy.Error

-- Define effects
data UserRepo m a where
    FindUser  :: UserId -> UserRepo m (Maybe User)
    SaveUser  :: User -> UserRepo m ()
    ListUsers :: UserRepo m [User]

makeSem ''UserRepo  -- generates findUser, saveUser, listUsers functions

data EmailService m a where
    SendEmail :: EmailAddress -> Subject -> Body -> EmailService m ()

makeSem ''EmailService

Business Logic with Polysemy

registerUser
    :: Members '[UserRepo, EmailService, Error RegistrationError] r
    => RegistrationRequest
    -> Sem r User
registerUser req = do
    existing <- findUser (reqEmail req)
    case existing of
        Just _  -> throw DuplicateEmail
        Nothing -> do
            let user = makeUser req
            saveUser user
            sendEmail (reqEmail req) "Welcome!" (welcomeBody user)
            return user

Polysemy Interpreters

Production:

runUserRepoDB :: Members '[Embed IO, Reader DBPool] r
              => Sem (UserRepo : r) a
              -> Sem r a
runUserRepoDB = interpret $ \case
    FindUser uid  -> embed . withPool $ \c -> queryOne c uid
    SaveUser user -> embed . withPool $ \c -> insertUser c user
    ListUsers     -> embed . withPool $ \c -> queryAll c

Test (pure, in-memory):

runUserRepoTest :: Sem (UserRepo : r) a
                -> Sem r (Map UserId User, a)
runUserRepoTest = runState Map.empty . reinterpret $ \case
    FindUser uid  -> gets (Map.lookup uid)
    SaveUser user -> modify (Map.insert (userId user) user)
    ListUsers     -> gets Map.elems

Testing Polysemy Code

spec :: Spec
spec = describe "registerUser" $ do

    it "creates user on success" $ do
        let req = RegistrationRequest "alice" "alice@example.com"
        result <- runM
            . runError @RegistrationError
            . runEmailServiceTest
            . fmap fst . runUserRepoTest
            $ registerUser req
        result `shouldBe` Right (User "alice" "alice@example.com")

The interpreter stack is composed right to left, and each layer peels off one effect.

Choosing Between MTL and Polysemy

	MTL	Polysemy
Maturity	Very stable	Actively developed
Performance	Good	Some overhead (GHC optimizes well)
n-effect composition	`O(n²)` instances	`O(1)` with row polymorphism
Testing	`newtype` + instances	Swap interpreters
Learning curve	Moderate	Steeper

For projects with 3-5 effects, MTL is simpler. For larger effect systems, Polysemy's composition advantage becomes meaningful.

The Testing Payoff

Both approaches give you the same property: business logic is a pure function from an initial state to a final state. Tests are fast (no IO, no database), deterministic (no network, no filesystem), and complete (you can inspect every side effect via the final state).

This is Haskell's answer to mocking — not a framework that intercepts calls at runtime, but a type system that makes the behavior-under-test a parameter of the function.