Data Engineering

dbt Advanced Testing Patterns: Custom Tests, Macros, and CI Integration

HelpMeTest

17 May 2026 — 7 min read

dbt's testing model goes well beyond the four built-in tests. Generic tests written as macros are reusable across your entire project, singular tests handle complex multi-table business rules, and the dbt-expectations package brings 50+ Great Expectations-style assertions into SQL. In CI, dbt test --select state:modified+ ensures only tests relevant to changed models run, keeping pipelines fast.

Key Takeaways

Generic tests are macros, not SQL files. When you write a test as a macro in macros/, it becomes reusable across every model in your project with a single YAML line. Write a test once, apply it everywhere.

Singular tests are the escape hatch for business rules. When a test is too complex for a macro — cross-model joins, multi-column logic, temporal constraints — write it as a standalone .sql file in tests/. dbt treats any query that returns rows as a failure.

Use state:modified+ selectors in CI. Running your full test suite on every PR in a large project takes too long. Slim CI with state:modified+ runs only tests downstream of changed models, cutting CI time from 30 minutes to 3 minutes.

dbt's Testing Philosophy

dbt treats tests as SQL queries. A test fails if the query returns any rows. That single rule is what makes dbt testing elegant: you don't need a framework, a runner, or assertion libraries — you write SQL that selects violations, and dbt interprets results.

This leads to two test types:

Generic tests: Macros that are parameterized and applied to any model via YAML.
Singular tests: One-off SQL files in tests/ that encode a specific business rule.

Built-In Tests: Understanding the Baseline

dbt ships four generic tests:

# models/orders/schema.yml
version: 2
models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: status
        tests:
          - accepted_values:
              values: ["pending", "processing", "shipped", "delivered", "cancelled"]
      - name: customer_id
        tests:
          - relationships:
              to: ref('customers')
              field: customer_id

Under the hood, unique compiles to:

select order_id, count(*) as n
from {{ model }}
group by 1
having n > 1

Rows returned = test failure. This is the pattern all custom tests follow.

Writing Generic Tests as Macros

Generic tests live in macros/ and follow the naming convention test_<name>.sql.

Example: Not Negative

-- macros/test_not_negative.sql
{% test not_negative(model, column_name) %}

select {{ column_name }}
from {{ model }}
where {{ column_name }} < 0

{% endtest %}

Apply it in YAML:

- name: amount
  tests:
    - not_negative

Example: Mutually Exclusive Status Flags

-- macros/test_mutually_exclusive_flags.sql
{% test mutually_exclusive_flags(model, column_names) %}

select *
from {{ model }}
where (
  {% for col in column_names %}
    case when {{ col }} then 1 else 0 end
    {% if not loop.last %} + {% endif %}
  {% endfor %}
) > 1

{% endtest %}

Apply it:

- name: orders
  tests:
    - mutually_exclusive_flags:
        column_names:
          - is_gift
          - is_subscription
          - is_bulk_order

Example: At Least One Not Null

-- macros/test_at_least_one_not_null.sql
{% test at_least_one_not_null(model, column_names) %}

select *
from {{ model }}
where
  {% for col in column_names %}
    {{ col }} is null
    {% if not loop.last %} and {% endif %}
  {% endfor %}

{% endtest %}

Example: Accepted Range with Configurable Tolerance

-- macros/test_accepted_range.sql
{% test accepted_range(model, column_name, min_value=none, max_value=none, inclusive=true) %}

select {{ column_name }}
from {{ model }}
where 1=1
  {% if min_value is not none %}
    {% if inclusive %}
      and {{ column_name }} < {{ min_value }}
    {% else %}
      and {{ column_name }} <= {{ min_value }}
    {% endif %}
  {% endif %}
  {% if max_value is not none %}
    {% if inclusive %}
      and {{ column_name }} > {{ max_value }}
    {% else %}
      and {{ column_name }} >= {{ max_value }}
    {% endif %}
  {% endif %}

{% endtest %}

- name: discount_pct
  tests:
    - accepted_range:
        min_value: 0
        max_value: 100
        inclusive: true

Singular Tests: Complex Business Rules

Singular tests are .sql files in the tests/ directory. They select violation rows — dbt fails the test if any rows are returned.

Referential Integrity Across Models

-- tests/assert_order_items_have_valid_orders.sql
-- Every order_item must reference an existing order

select oi.order_item_id, oi.order_id
from {{ ref('order_items') }} oi
left join {{ ref('orders') }} o on oi.order_id = o.order_id
where o.order_id is null

Temporal Consistency

-- tests/assert_order_closed_after_created.sql
-- Closed orders must have closed_at >= created_at

select order_id, created_at, closed_at
from {{ ref('orders') }}
where status = 'delivered'
  and closed_at < created_at

Aggregate Consistency

-- tests/assert_order_total_matches_items.sql
-- order.total_amount must equal sum of order_items.line_total

with order_sums as (
    select
        order_id,
        sum(unit_price * quantity) as calculated_total
    from {{ ref('order_items') }}
    group by 1
)
select
    o.order_id,
    o.total_amount,
    s.calculated_total,
    abs(o.total_amount - s.calculated_total) as discrepancy
from {{ ref('orders') }} o
join order_sums s using (order_id)
where abs(o.total_amount - s.calculated_total) > 0.01

Cross-Model Freshness

-- tests/assert_daily_summary_is_current.sql
-- The daily_summary model should contain data for today

select 1
from (
    select max(summary_date) as latest_date
    from {{ ref('daily_summary') }}
) t
where latest_date < current_date - interval '1 day'

The dbt-expectations Package

dbt-expectations brings 50+ Great Expectations-style tests to dbt. Install it:

# packages.yml
packages:
  - package: calogica/dbt_expectations
    version: [">=0.10.0", "<0.11.0"]

dbt deps

Column Type and Format Tests

- name: email
  tests:
    - dbt_expectations.expect_column_values_to_match_regex:
        regex: "^[^@]+@[^@]+\\.[^@]+$"
    - dbt_expectations.expect_column_values_to_not_be_null

- name: phone_number
  tests:
    - dbt_expectations.expect_column_values_to_match_like_pattern:
        like_pattern: "+%-%"

- name: created_at
  tests:
    - dbt_expectations.expect_column_values_to_be_between:
        min_value: "'2020-01-01'::date"
        max_value: "current_date"
        row_condition: "created_at is not null"

Distribution and Volume Tests

- name: orders
  tests:
    - dbt_expectations.expect_table_row_count_to_be_between:
        min_value: 1000
        max_value: 10000000

    - dbt_expectations.expect_table_row_count_to_equal_other_table:
        compare_model: ref('orders_backup')

    - dbt_expectations.expect_column_mean_to_be_between:
        column_name: order_value
        min_value: 50
        max_value: 500

    - dbt_expectations.expect_column_stdev_to_be_between:
        column_name: order_value
        min_value: 0
        max_value: 300

    - dbt_expectations.expect_column_values_to_be_within_n_stdevs:
        column_name: order_value
        n: 5  # flag extreme outliers

Multi-Column Tests

- name: orders
  tests:
    - dbt_expectations.expect_compound_columns_to_be_unique:
        column_list: ["customer_id", "product_id", "created_date"]
        quote_columns: false

Testing Incremental Models

Incremental models are the hardest to test because behavior changes depending on whether dbt is doing a full refresh or an incremental run. Test both modes.

Full Refresh Test

-- tests/assert_incremental_orders_no_duplicates.sql
-- After a full refresh, no order_id should appear twice

{{ config(tags=['incremental']) }}

select order_id, count(*) as n
from {{ ref('orders_incremental') }}
group by 1
having n > 1

Idempotency Test via Macro

A critical property of incremental models: running them twice should produce the same result as running them once. Test this with a custom macro:

-- macros/test_incremental_idempotency.sql
{% test incremental_idempotency(model, key_column, date_column, lookback_days=7) %}

-- Run the incremental model's logic twice for the same window
-- and verify results are identical

with first_run as (
    select {{ key_column }}, count(*) as event_count
    from {{ model }}
    where {{ date_column }} >= current_date - interval '{{ lookback_days }} days'
    group by 1
),

second_run as (
    -- In real CI this would be a separate materialization
    -- Here we self-join to simulate re-processing
    select {{ key_column }}, count(*) as event_count
    from {{ model }}
    where {{ date_column }} >= current_date - interval '{{ lookback_days }} days'
    group by 1
)

select f.{{ key_column }}
from first_run f
full outer join second_run s using ({{ key_column }})
where f.event_count != s.event_count
   or f.{{ key_column }} is null
   or s.{{ key_column }} is null

{% endtest %}

Testing the Merge Strategy

For incremental models using unique_key:

-- models/orders_incremental.sql
{{ config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='merge'
) }}

select
    order_id,
    customer_id,
    status,
    updated_at
from {{ source('raw', 'orders') }}

{% if is_incremental() %}
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

Test that updates are applied correctly:

-- tests/assert_incremental_merge_applies_updates.sql
-- If a row exists in raw with a newer updated_at, the incremental
-- model must reflect the latest status

select r.order_id
from {{ source('raw', 'orders') }} r
join {{ ref('orders_incremental') }} i using (order_id)
where r.updated_at > i.updated_at
  and r.status != i.status

Testing ref() Chains

In complex projects, models reference other models 3-4 levels deep. A bug in a foundational model propagates silently downstream. Tag tests to run against the full dependency chain:

# models/marts/revenue/schema.yml
version: 2
models:
  - name: revenue_daily
    tests:
      - dbt_expectations.expect_table_row_count_to_be_between:
          min_value: 1
          max_value: 10000
          config:
            severity: error
            tags: ['critical']
    columns:
      - name: revenue_usd
        tests:
          - not_null
          - not_negative
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 10000000

Run the full downstream chain for a changed model:

dbt test --<span class="hljs-keyword">select orders+  <span class="hljs-comment"># all models and tests downstream of orders
dbt <span class="hljs-built_in">test --<span class="hljs-keyword">select +revenue_daily  <span class="hljs-comment"># all models upstream of revenue_daily

CI Integration with state:modified

Slim CI is the pattern of running only tests affected by changed models. It requires a production manifest artifact.

GitHub Actions with Slim CI

# .github/workflows/dbt-ci.yaml
name: dbt CI

on:
  pull_request:
    paths:
      - "models/**"
      - "tests/**"
      - "macros/**"
      - "dbt_project.yml"

jobs:
  dbt-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dbt
        run: pip install dbt-bigquery dbt-expectations

      - name: dbt deps
        run: dbt deps
        working-directory: ./transform

      - name: Download production manifest
        run: |
          gsutil cp gs://my-dbt-artifacts/prod/manifest.json ./transform/manifest.json
        env:
          GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.GCP_SA_KEY }}

      - name: dbt compile (generate CI manifest)
        run: dbt compile --target ci
        working-directory: ./transform
        env:
          DBT_PROFILES_DIR: ./

      - name: dbt run — modified models only
        run: |
          dbt run \
            --select state:modified+ \
            --defer \
            --state ./manifest.json \
            --target ci
        working-directory: ./transform

      - name: dbt test — modified models only
        run: |
          dbt test \
            --select state:modified+ \
            --defer \
            --state ./manifest.json \
            --target ci
        working-directory: ./transform

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: dbt-test-results
          path: transform/target/run_results.json

The --defer flag is critical: for models not in state:modified+, dbt reads from the production environment instead of rerunning them. This lets downstream tests validate against real production data without rebuilding the entire warehouse.

Profiles for CI

# profiles.yml
my_project:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: service-account
      project: my-project-dev
      dataset: dbt_dev
      threads: 4
    ci:
      type: bigquery
      method: service-account
      project: my-project-dev
      dataset: "dbt_ci_{{ env_var('GITHUB_RUN_ID') }}"  # isolated per PR
      threads: 8
    prod:
      type: bigquery
      method: service-account
      project: my-project-prod
      dataset: dbt_prod
      threads: 16

Test Severity and Warn-vs-Error

Not all test failures should block deployments. Use severity to distinguish:

- name: revenue_usd
  tests:
    - not_null:
        config:
          severity: error      # blocks deployment
    - dbt_expectations.expect_column_values_to_be_between:
        min_value: 0
        max_value: 10000000
        config:
          severity: warn       # logs warning, does not block
          warn_if: ">= 10"    # warn only if 10+ rows violate
          error_if: ">= 100"  # error if 100+ rows violate

Organizing Tests at Scale

For large projects (100+ models), organize tests with tags and config blocks:

# dbt_project.yml
tests:
  my_project:
    marts:
      +tags: ['critical']
      +severity: error
    staging:
      +tags: ['staging']
      +severity: warn

Run tests by tag in CI:

dbt test --<span class="hljs-keyword">select tag:critical          <span class="hljs-comment"># only critical tests
dbt <span class="hljs-built_in">test --exclude tag:staging          <span class="hljs-comment"># skip staging tests in prod check
dbt <span class="hljs-built_in">test --<span class="hljs-keyword">select state:modified+ tag:critical  <span class="hljs-comment"># modified + critical

Debugging Failing Tests

Get the compiled SQL for any failing test:

dbt test --<span class="hljs-keyword">select my_model --store-failures

With --store-failures, dbt materializes the failing rows into a schema called dbt_test__audit. Query it directly:

select * from dbt_test__audit.not_null_orders_order_id limit 100;

This is far faster than re-running the test and trying to reconstruct what violated it.

HelpMeTest can run your data pipeline tests automatically — sign up free