Dagger.io Testing Guide: Programmatic CI Pipelines That Actually Test

Dagger.io flips the CI/CD model on its head: instead of YAML pipelines that are hard to test locally, you write your pipelines in Python, TypeScript, or Go and run them anywhere a container runtime is available. The same pipeline runs on your laptop, in GitHub Actions, and in GitLab CI without modification.

This guide covers how to structure Dagger pipelines specifically for testing workflows: running unit tests, integration tests, and end-to-end tests as first-class Dagger functions.

What Is Dagger?

Dagger is a programmable CI/CD engine. You define your pipeline as code using Dagger's SDK, and Dagger executes each step in a container. The key insight: the pipeline is portable. No more "works on CI but not locally" debugging sessions.

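# Install the Python SDK
pip install dagger-io

# Run a pipeline function locally (requires the Dagger CLI and a container runtime such as Docker)
dagger call test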

Everything executes in containers. Dagger handles caching, parallelism, and artifact passing between steps.

Why Dagger for Testing Pipelines

Traditional CI YAML has a serious problem: you can't unit test your CI configuration. If your GitHub Actions workflow is broken, you find out after a push. With Dagger, a pipeline step is an ordinary function that you can run before pushing:

import dagger
from dagger import dag, function, object_type

@object_type
class MyPipeline:
    @function
    async def test(self, source: dagger.Directory) -> str:
        """Run all tests and return results."""
        return await (
            dag.container()  # containers are created through the global `dag` client
            .from_("python:3.12-slim")
            .with_directory("/src", source)
            .with_workdir("/src")
            .with_exec(["pip", "install", "-e", ".[dev]"])
            .with_exec(["pytest", "--tb=short", "-q"])
            .stdout()
        )

You can call this function locally: dagger call test --source .
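
Because the pipeline is ordinary Python, parts of it can be factored into pure helpers and unit tested without starting a container. The following is a minimal sketch under that assumption; pytest_command is a hypothetical helper name, not part of the Dagger SDK. It builds the argument list that would be passed to with_exec:

```python
def pytest_command(path: str = "", *, verbose: bool = False,
                   extra: tuple[str, ...] = ()) -> list[str]:
    """Build the argv list for a pytest run (hypothetical helper, not a Dagger API)."""
    cmd = ["pytest", "--tb=short"]
    # -v prints one line per test, -q keeps the output short
    cmd.append("-v" if verbose else "-q")
    if path:
        cmd.append(path)
    cmd.extend(extra)
    return cmd
```

A pipeline function could then call .with_exec(pytest_command("tests/unit/", verbose=True)), and the helper itself can be covered by a plain pytest test that never touches a container.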

Setting Up a Dagger Testing Pipeline

Python SDK

# dagger/src/main.py
import asyncio  # needed by full_check below

import dagger
from dagger import dag, function, object_type

@object_type
class MyPipeline:
    
    @function
    async def unit_test(self, source: dagger.Directory) -> str:
        """Run unit tests with pytest."""
        return await (
            dag.container()
            .from_("python:3.12-slim")
            .with_directory("/src", source, exclude=[".venv", "__pycache__"])
            .with_workdir("/src")
            .with_exec(["pip", "install", "--quiet", "-e", ".[dev]"])
            .with_exec(["pytest", "tests/unit/", "-v", "--tb=short"])
            .stdout()
        )
    
    @function
    async def integration_test(
        self,
        source: dagger.Directory,
        db_url: dagger.Secret,
    ) -> str:
        """Run integration tests against a real database."""
        postgres = (
            dag.container()
            .from_("postgres:16-alpine")
            .with_env_variable("POSTGRES_PASSWORD", "test")
            .with_env_variable("POSTGRES_DB", "testdb")
            .with_exposed_port(5432)
            .as_service()
        )
        
        return await (
            dag.container()
            .from_("python:3.12-slim")
            .with_service_binding("postgres", postgres)
            .with_secret_variable("DATABASE_URL", db_url)
            .with_directory("/src", source)
            .with_workdir("/src")
            .with_exec(["pip", "install", "-e", ".[dev]"])
            .with_exec(["pytest", "tests/integration/", "-v"])
            .stdout()
        )
    
    @function
    async def lint(self, source: dagger.Directory) -> str:
        """Run ruff linting."""
        return await (
            dag.container()
            .from_("python:3.12-slim")
            .with_exec(["pip", "install", "ruff"])
            .with_directory("/src", source)
            .with_workdir("/src")
            .with_exec(["ruff", "check", "."])
            .stdout()
        )
    
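    @function
    async def typecheck(self, source: dagger.Directory) -> str:
        """Run static type checks (mypy is an assumption; swap in your own checker)."""
        return await (
            dag.container()
            .from_("python:3.12-slim")
            .with_directory("/src", source)
            .with_workdir("/src")
            .with_exec(["pip", "install", "--quiet", "-e", ".[dev]", "mypy"])
            .with_exec(["mypy", "."])
            .stdout()
        )
    
    @function
    async def full_check(self, source: dagger.Directory) -> str:
        """Run lint, unit tests, and type checking in parallel."""
        # gather() raises as soon as one step fails, so reaching the return means all passed
        await asyncio.gather(
            self.lint(source),
            self.unit_test(source),
            self.typecheck(source),
        )
        return "Lint: OK\nTests: OK\nTypes: OK"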
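
The integration test above expects a DATABASE_URL secret. When the database is the bound Postgres service, the binding alias ("postgres") is the hostname the test container sees. A small sketch of building a matching URL follows, assuming the default user of the official postgres image ("postgres") and the password and database name set in the service definition; postgres_url is a hypothetical helper name:

```python
from urllib.parse import quote


def postgres_url(host: str = "postgres", *, user: str = "postgres",
                 password: str = "test", database: str = "testdb",
                 port: int = 5432) -> str:
    """Build a connection URL for the bound service (hypothetical helper)."""
    # Percent-encode the credentials so characters like "@" or ":" stay valid in a URL
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"postgresql://{credentials}@{host}:{port}/{database}"
```

Exporting the result as DATABASE_URL in your shell lets you pass it to the function with --db-url env:DATABASE_URL.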

TypeScript SDK

// src/index.ts
import { dag, Container, Directory, object, func } from "@dagger.io/dagger"

@object()
class MyPipeline {
  
  @func()
  async unitTest(source: Directory): Promise<string> {
    return dag
      .container()
      .from("node:22-alpine")
      .withDirectory("/src", source, { exclude: ["node_modules", ".next"] })
      .withWorkdir("/src")
      .withExec(["npm", "ci"])
      .withExec(["npm", "test", "--", "--coverage", "--passWithNoTests"])
      .stdout()
  }
  
  @func()
  async e2eTest(source: Directory): Promise<string> {
    // Spin up the app as a service
    const app = dag
      .container()
      .from("node:22-alpine")
      .withDirectory("/app", source)
      .withWorkdir("/app")
      .withExec(["npm", "ci"])
      .withExec(["npm", "run", "build"])
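      .withExposedPort(3000)
      // Start the server as the service command (Dagger 0.15+); on these versions a
      // plain withExec(["npm", "start"]) step would run during the build and never return
      .asService({ args: ["npm", "start"] })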
    
    // Run Playwright against it
    return dag
      .container()
      .from("mcr.microsoft.com/playwright:v1.50.0-noble")
      .withServiceBinding("app", app)
      .withDirectory("/tests", source.directory("e2e"))
      .withWorkdir("/tests")
      .withExec(["npm", "ci"])
      .withEnvVariable("BASE_URL", "http://app:3000")
      .withExec(["npx", "playwright", "test"])
      .stdout()
  }
}

Running Dagger Pipelines Locally

# Run unit tests
dagger call unit-test --source .

# Run with a secret
dagger call integration-test \
  --source . \
  --db-url env:DATABASE_URL

# Run the full check
dagger call full-check --source .

This runs identically on your machine and in CI. No more "works locally, breaks in GitHub Actions."

Integrating with GitHub Actions

# .github/workflows/test.yml
name: Test

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
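      - name: Install Dagger CLI
        run: |
          # The install script writes to ./bin by default, so pick a directory and add it to PATH
          mkdir -p "$HOME/.local/bin"
          curl -fsSL https://dl.dagger.io/dagger/install.sh | BIN_DIR="$HOME/.local/bin" sh
          echo "$HOME/.local/bin" >> "$GITHUB_PATH"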
      
      - name: Run tests
        run: dagger call full-check --source .
        env:
          DAGGER_CLOUD_TOKEN: ${{ secrets.DAGGER_CLOUD_TOKEN }}

If you have Dagger Cloud, you get shared caching across runs, so steps such as pip install or npm ci can be served from cache whenever their inputs are unchanged.

Parallelism in Dagger Tests

Dagger makes parallel test execution trivial:

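import asyncio

@function
async def full_suite(self, source: dagger.Directory) -> str:
    """Run all test suites in parallel and report every failure."""
    # typecheck is assumed to be defined alongside unit_test and lint
    results = await asyncio.gather(
        self.unit_test(source),
        self.lint(source),
        self.typecheck(source),
        return_exceptions=True,  # collect failures instead of stopping at the first one
    )
    
    failures = [r for r in results if isinstance(r, Exception)]
    if failures:
        # ExceptionGroup requires Python 3.11+
        raise ExceptionGroup("Test failures", failures)
    
    # Return a plain string; dict is not a type Dagger functions can return
    unit, lint, types = results
    return f"unit:\n{unit}\nlint:\n{lint}\ntypes:\n{types}"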

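All three run concurrently in separate containers, so wall-clock time approaches that of the slowest suite rather than the sum of all three. The actual saving depends on how evenly the suites are balanced and on the CPU available to the engine.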
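
The effect of asyncio.gather() with return_exceptions=True can be sketched with the standard library alone. In this illustration the suites are simulated with asyncio.sleep rather than real containers, and fake_suite and run_all are hypothetical names:

```python
import asyncio
import time


async def fake_suite(name: str, seconds: float, fail: bool = False) -> str:
    """Stand-in for a containerized test suite: wait, then pass or fail."""
    await asyncio.sleep(seconds)
    if fail:
        raise RuntimeError(f"{name} failed")
    return f"{name}: OK"


async def run_all(suites: list[tuple]) -> tuple[list[str], list[str], float]:
    """Run every suite concurrently; return (passed, failed, elapsed seconds)."""
    started = time.perf_counter()
    results = await asyncio.gather(
        *(fake_suite(*suite) for suite in suites),
        return_exceptions=True,  # keep going when one suite raises
    )
    elapsed = time.perf_counter() - started
    passed = [r for r in results if not isinstance(r, BaseException)]
    failed = [str(r) for r in results if isinstance(r, BaseException)]
    return passed, failed, elapsed
```

Calling asyncio.run(run_all([("unit", 0.2), ("lint", 0.2), ("types", 0.2, True)])) finishes in roughly the time of one suite, and the failing suite is reported without cancelling the other two.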

Caching Dependencies

Dagger caches container layers, but you can also explicitly cache directories:

@function
async def test_with_cache(self, source: dagger.Directory) -> str:
    # Cache the pip download cache
    pip_cache = dag.cache_volume("pip-cache")
    
    return await (
        dag.container()
        .from_("python:3.12-slim")
        .with_mounted_cache("/root/.cache/pip", pip_cache)
        .with_directory("/src", source)
        .with_workdir("/src")
        .with_exec(["pip", "install", "-e", ".[dev]"])
        .with_exec(["pytest", "-q"])
        .stdout()
    )

On subsequent runs, pip reuses packages already stored in the cache volume instead of downloading them again.

Testing Matrix Builds

Run tests against multiple Python versions (the same pattern applies to Node images):

@function
async def matrix_test(self, source: dagger.Directory) -> list[str]:
    """Test against Python 3.10, 3.11, 3.12."""
    versions = ["3.10", "3.11", "3.12"]
    
    results = await asyncio.gather(*[
        dag.container()
        .from_(f"python:{v}-slim")
        .with_directory("/src", source)
        .with_workdir("/src")
        .with_exec(["pip", "install", "-e", ".[dev]"])
        .with_exec(["pytest", "-q"])
        .stdout()
        for v in versions
    ])
    
    return list(results)
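
A matrix run is easier to read when each result is labelled with its version. The sketch below assumes the results were gathered with return_exceptions=True, so a failed version appears as an exception object; matrix_images and summarize are hypothetical helper names:

```python
def matrix_images(versions: list[str], variant: str = "slim") -> list[str]:
    """Map Python versions to image references, e.g. '3.12' -> 'python:3.12-slim'."""
    return [f"python:{version}-{variant}" for version in versions]


def summarize(versions: list[str], outcomes: list[object]) -> str:
    """Build a one-line-per-version report from gathered outcomes."""
    lines = []
    for version, outcome in zip(versions, outcomes):
        # With return_exceptions=True a failure is an exception instance, not a string
        status = "FAIL" if isinstance(outcome, BaseException) else "OK"
        lines.append(f"Python {version}: {status}")
    return "\n".join(lines)
```

Returning the summary string from the pipeline function gives a compact report while still showing which versions failed.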

End-to-End Testing with HelpMeTest

For UI/browser E2E tests, HelpMeTest integrates directly into your Dagger pipeline via CLI:

@function
async def e2e_test(
    self,
    source: dagger.Directory,
    helpmetest_token: dagger.Secret,
) -> str:
    """Run HelpMeTest E2E suite against the deployed app."""
    return await (
        dag.container()
        .from_("node:22-alpine")
        .with_exec(["npm", "install", "-g", "helpmetest"])
        .with_secret_variable("HELPMETEST_TOKEN", helpmetest_token)
        .with_exec(["helpmetest", "run", "--project", "my-app", "--wait"])
        .stdout()
    )

HelpMeTest runs Robot Framework + Playwright tests in the cloud — no Playwright installation needed in your Dagger container.

Dagger vs. GitHub Actions YAML

Feature              | Dagger                       | GitHub Actions YAML
Local execution      | ✅ Same command              | ❌ Need act workarounds
Type-safe pipeline   | ✅ Python/TS/Go              | ❌ Strings and YAML
Reusable functions   | ✅ Import as module          | ⚠️ Composite actions only
Container isolation  | ✅ Always                    | ⚠️ Only with container:
Caching              | ✅ Automatic layer + volume  | ⚠️ Manual cache actions
Parallelism          | ✅ asyncio.gather()          | ⚠️ jobs.needs only
Testing the pipeline | ✅ Unit test functions       | ❌ Not possible

When to Use Dagger

Dagger is the right choice when:

  • You have complex build/test logic that belongs in code, not YAML
  • Your team constantly debugs "works locally, fails in CI"
  • You want to share pipeline logic between projects as modules
  • You need fine-grained caching control

If your pipeline is npm test && npm run build, stay with YAML. Dagger's setup cost pays off once a pipeline reaches moderate complexity.

Summary

Dagger brings software engineering principles to CI/CD: type-safe, testable, locally executable pipelines. For testing workflows specifically, the benefits are significant:

  1. Local parity — test your tests locally before pushing
  2. Parallelism — run test suites concurrently with asyncio.gather()
  3. Services — spin up Postgres, Redis, or any service as a sidecar
  4. Caching — dependency installs cached automatically
  5. Matrix builds — test against multiple versions without YAML gymnastics

Start by installing the Dagger CLI and the Python SDK (pip install dagger-io), then convert your most complex CI step first.
