dbt Advanced Testing Patterns: Custom Tests, Macros, and CI Integration
dbt's testing model goes well beyond the four built-in tests. Generic tests written as macros are reusable across your entire project, singular tests handle complex multi-table business rules, and the dbt-expectations package brings 50+ Great Expectations-style assertions into SQL. In CI, dbt test --select state:modified+ ensures only tests relevant to changed models run, keeping pipelines fast.
Key Takeaways
Generic tests are macros, not SQL files. When you write a test as a macro in macros/, it becomes reusable across every model in your project with a single YAML line. Write a test once, apply it everywhere.
Singular tests are the escape hatch for business rules. When a test is too complex for a macro — cross-model joins, multi-column logic, temporal constraints — write it as a standalone .sql file in tests/. dbt treats any query that returns rows as a failure.
Use state:modified+ selectors in CI. Running your full test suite on every PR in a large project takes too long. Slim CI with state:modified+ runs only tests downstream of changed models, cutting CI time from 30 minutes to 3 minutes.
dbt's Testing Philosophy
dbt treats tests as SQL queries. A test fails if the query returns any rows. That single rule is what makes dbt testing elegant: you don't need a framework, a runner, or assertion libraries — you write SQL that selects violations, and dbt interprets results.
This leads to two test types:
- Generic tests: Macros that are parameterized and applied to any model via YAML.
- Singular tests: One-off SQL files in
tests/that encode a specific business rule.
Built-In Tests: Understanding the Baseline
dbt ships four generic tests:
# models/orders/schema.yml
version: 2
models:
- name: orders
columns:
- name: order_id
tests:
- unique
- not_null
- name: status
tests:
- accepted_values:
values: ["pending", "processing", "shipped", "delivered", "cancelled"]
- name: customer_id
tests:
- relationships:
to: ref('customers')
field: customer_idUnder the hood, unique compiles to:
select order_id, count(*) as n
from {{ model }}
group by 1
having n > 1Rows returned = test failure. This is the pattern all custom tests follow.
Writing Generic Tests as Macros
Generic tests live in macros/ and follow the naming convention test_<name>.sql.
Example: Not Negative
-- macros/test_not_negative.sql
{% test not_negative(model, column_name) %}
select {{ column_name }}
from {{ model }}
where {{ column_name }} < 0
{% endtest %}Apply it in YAML:
- name: amount
tests:
- not_negativeExample: Mutually Exclusive Status Flags
-- macros/test_mutually_exclusive_flags.sql
{% test mutually_exclusive_flags(model, column_names) %}
select *
from {{ model }}
where (
{% for col in column_names %}
case when {{ col }} then 1 else 0 end
{% if not loop.last %} + {% endif %}
{% endfor %}
) > 1
{% endtest %}Apply it:
- name: orders
tests:
- mutually_exclusive_flags:
column_names:
- is_gift
- is_subscription
- is_bulk_orderExample: At Least One Not Null
-- macros/test_at_least_one_not_null.sql
{% test at_least_one_not_null(model, column_names) %}
select *
from {{ model }}
where
{% for col in column_names %}
{{ col }} is null
{% if not loop.last %} and {% endif %}
{% endfor %}
{% endtest %}Example: Accepted Range with Configurable Tolerance
-- macros/test_accepted_range.sql
{% test accepted_range(model, column_name, min_value=none, max_value=none, inclusive=true) %}
select {{ column_name }}
from {{ model }}
where 1=1
{% if min_value is not none %}
{% if inclusive %}
and {{ column_name }} < {{ min_value }}
{% else %}
and {{ column_name }} <= {{ min_value }}
{% endif %}
{% endif %}
{% if max_value is not none %}
{% if inclusive %}
and {{ column_name }} > {{ max_value }}
{% else %}
and {{ column_name }} >= {{ max_value }}
{% endif %}
{% endif %}
{% endtest %}- name: discount_pct
tests:
- accepted_range:
min_value: 0
max_value: 100
inclusive: trueSingular Tests: Complex Business Rules
Singular tests are .sql files in the tests/ directory. They select violation rows — dbt fails the test if any rows are returned.
Referential Integrity Across Models
-- tests/assert_order_items_have_valid_orders.sql
-- Every order_item must reference an existing order
select oi.order_item_id, oi.order_id
from {{ ref('order_items') }} oi
left join {{ ref('orders') }} o on oi.order_id = o.order_id
where o.order_id is nullTemporal Consistency
-- tests/assert_order_closed_after_created.sql
-- Closed orders must have closed_at >= created_at
select order_id, created_at, closed_at
from {{ ref('orders') }}
where status = 'delivered'
and closed_at < created_atAggregate Consistency
-- tests/assert_order_total_matches_items.sql
-- order.total_amount must equal sum of order_items.line_total
with order_sums as (
select
order_id,
sum(unit_price * quantity) as calculated_total
from {{ ref('order_items') }}
group by 1
)
select
o.order_id,
o.total_amount,
s.calculated_total,
abs(o.total_amount - s.calculated_total) as discrepancy
from {{ ref('orders') }} o
join order_sums s using (order_id)
where abs(o.total_amount - s.calculated_total) > 0.01Cross-Model Freshness
-- tests/assert_daily_summary_is_current.sql
-- The daily_summary model should contain data for today
select 1
from (
select max(summary_date) as latest_date
from {{ ref('daily_summary') }}
) t
where latest_date < current_date - interval '1 day'The dbt-expectations Package
dbt-expectations brings 50+ Great Expectations-style tests to dbt. Install it:
# packages.yml
packages:
- package: calogica/dbt_expectations
version: [">=0.10.0", "<0.11.0"]dbt depsColumn Type and Format Tests
- name: email
tests:
- dbt_expectations.expect_column_values_to_match_regex:
regex: "^[^@]+@[^@]+\\.[^@]+$"
- dbt_expectations.expect_column_values_to_not_be_null
- name: phone_number
tests:
- dbt_expectations.expect_column_values_to_match_like_pattern:
like_pattern: "+%-%"
- name: created_at
tests:
- dbt_expectations.expect_column_values_to_be_between:
min_value: "'2020-01-01'::date"
max_value: "current_date"
row_condition: "created_at is not null"Distribution and Volume Tests
- name: orders
tests:
- dbt_expectations.expect_table_row_count_to_be_between:
min_value: 1000
max_value: 10000000
- dbt_expectations.expect_table_row_count_to_equal_other_table:
compare_model: ref('orders_backup')
- dbt_expectations.expect_column_mean_to_be_between:
column_name: order_value
min_value: 50
max_value: 500
- dbt_expectations.expect_column_stdev_to_be_between:
column_name: order_value
min_value: 0
max_value: 300
- dbt_expectations.expect_column_values_to_be_within_n_stdevs:
column_name: order_value
n: 5 # flag extreme outliersMulti-Column Tests
- name: orders
tests:
- dbt_expectations.expect_compound_columns_to_be_unique:
column_list: ["customer_id", "product_id", "created_date"]
quote_columns: falseTesting Incremental Models
Incremental models are the hardest to test because behavior changes depending on whether dbt is doing a full refresh or an incremental run. Test both modes.
Full Refresh Test
-- tests/assert_incremental_orders_no_duplicates.sql
-- After a full refresh, no order_id should appear twice
{{ config(tags=['incremental']) }}
select order_id, count(*) as n
from {{ ref('orders_incremental') }}
group by 1
having n > 1Idempotency Test via Macro
A critical property of incremental models: running them twice should produce the same result as running them once. Test this with a custom macro:
-- macros/test_incremental_idempotency.sql
{% test incremental_idempotency(model, key_column, date_column, lookback_days=7) %}
-- Run the incremental model's logic twice for the same window
-- and verify results are identical
with first_run as (
select {{ key_column }}, count(*) as event_count
from {{ model }}
where {{ date_column }} >= current_date - interval '{{ lookback_days }} days'
group by 1
),
second_run as (
-- In real CI this would be a separate materialization
-- Here we self-join to simulate re-processing
select {{ key_column }}, count(*) as event_count
from {{ model }}
where {{ date_column }} >= current_date - interval '{{ lookback_days }} days'
group by 1
)
select f.{{ key_column }}
from first_run f
full outer join second_run s using ({{ key_column }})
where f.event_count != s.event_count
or f.{{ key_column }} is null
or s.{{ key_column }} is null
{% endtest %}Testing the Merge Strategy
For incremental models using unique_key:
-- models/orders_incremental.sql
{{ config(
materialized='incremental',
unique_key='order_id',
incremental_strategy='merge'
) }}
select
order_id,
customer_id,
status,
updated_at
from {{ source('raw', 'orders') }}
{% if is_incremental() %}
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}Test that updates are applied correctly:
-- tests/assert_incremental_merge_applies_updates.sql
-- If a row exists in raw with a newer updated_at, the incremental
-- model must reflect the latest status
select r.order_id
from {{ source('raw', 'orders') }} r
join {{ ref('orders_incremental') }} i using (order_id)
where r.updated_at > i.updated_at
and r.status != i.statusTesting ref() Chains
In complex projects, models reference other models 3-4 levels deep. A bug in a foundational model propagates silently downstream. Tag tests to run against the full dependency chain:
# models/marts/revenue/schema.yml
version: 2
models:
- name: revenue_daily
tests:
- dbt_expectations.expect_table_row_count_to_be_between:
min_value: 1
max_value: 10000
config:
severity: error
tags: ['critical']
columns:
- name: revenue_usd
tests:
- not_null
- not_negative
- dbt_expectations.expect_column_values_to_be_between:
min_value: 0
max_value: 10000000Run the full downstream chain for a changed model:
dbt test --<span class="hljs-keyword">select orders+ <span class="hljs-comment"># all models and tests downstream of orders
dbt <span class="hljs-built_in">test --<span class="hljs-keyword">select +revenue_daily <span class="hljs-comment"># all models upstream of revenue_dailyCI Integration with state:modified
Slim CI is the pattern of running only tests affected by changed models. It requires a production manifest artifact.
GitHub Actions with Slim CI
# .github/workflows/dbt-ci.yaml
name: dbt CI
on:
pull_request:
paths:
- "models/**"
- "tests/**"
- "macros/**"
- "dbt_project.yml"
jobs:
dbt-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Install dbt
run: pip install dbt-bigquery dbt-expectations
- name: dbt deps
run: dbt deps
working-directory: ./transform
- name: Download production manifest
run: |
gsutil cp gs://my-dbt-artifacts/prod/manifest.json ./transform/manifest.json
env:
GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.GCP_SA_KEY }}
- name: dbt compile (generate CI manifest)
run: dbt compile --target ci
working-directory: ./transform
env:
DBT_PROFILES_DIR: ./
- name: dbt run — modified models only
run: |
dbt run \
--select state:modified+ \
--defer \
--state ./manifest.json \
--target ci
working-directory: ./transform
- name: dbt test — modified models only
run: |
dbt test \
--select state:modified+ \
--defer \
--state ./manifest.json \
--target ci
working-directory: ./transform
- name: Upload test results
if: always()
uses: actions/upload-artifact@v4
with:
name: dbt-test-results
path: transform/target/run_results.jsonThe --defer flag is critical: for models not in state:modified+, dbt reads from the production environment instead of rerunning them. This lets downstream tests validate against real production data without rebuilding the entire warehouse.
Profiles for CI
# profiles.yml
my_project:
target: dev
outputs:
dev:
type: bigquery
method: service-account
project: my-project-dev
dataset: dbt_dev
threads: 4
ci:
type: bigquery
method: service-account
project: my-project-dev
dataset: "dbt_ci_{{ env_var('GITHUB_RUN_ID') }}" # isolated per PR
threads: 8
prod:
type: bigquery
method: service-account
project: my-project-prod
dataset: dbt_prod
threads: 16Test Severity and Warn-vs-Error
Not all test failures should block deployments. Use severity to distinguish:
- name: revenue_usd
tests:
- not_null:
config:
severity: error # blocks deployment
- dbt_expectations.expect_column_values_to_be_between:
min_value: 0
max_value: 10000000
config:
severity: warn # logs warning, does not block
warn_if: ">= 10" # warn only if 10+ rows violate
error_if: ">= 100" # error if 100+ rows violateOrganizing Tests at Scale
For large projects (100+ models), organize tests with tags and config blocks:
# dbt_project.yml
tests:
my_project:
marts:
+tags: ['critical']
+severity: error
staging:
+tags: ['staging']
+severity: warnRun tests by tag in CI:
dbt test --<span class="hljs-keyword">select tag:critical <span class="hljs-comment"># only critical tests
dbt <span class="hljs-built_in">test --exclude tag:staging <span class="hljs-comment"># skip staging tests in prod check
dbt <span class="hljs-built_in">test --<span class="hljs-keyword">select state:modified+ tag:critical <span class="hljs-comment"># modified + criticalDebugging Failing Tests
Get the compiled SQL for any failing test:
dbt test --<span class="hljs-keyword">select my_model --store-failuresWith --store-failures, dbt materializes the failing rows into a schema called dbt_test__audit. Query it directly:
select * from dbt_test__audit.not_null_orders_order_id limit 100;This is far faster than re-running the test and trying to reconstruct what violated it.
HelpMeTest can run your data pipeline tests automatically — sign up free