dbt Testing Guide: Schema Tests, dbt-expectations, and Custom Tests
dbt ships with four built-in tests (not_null, unique, accepted_values, relationships) that cover the basics. For richer assertions—value ranges, regex patterns, statistical checks—add the dbt-expectations package. When neither covers your case, write a custom generic test. This guide covers all three layers plus how to run tests in CI.
Key Takeaways
Built-in tests cover the four most common checks. not_null, unique, accepted_values, and relationships handle the majority of data integrity needs with zero extra packages.
dbt-expectations adds 50+ data quality assertions. expect_column_values_to_be_between, expect_column_mean_to_be_between, expect_column_to_match_regex—these translate Great Expectations patterns into dbt syntax.
Custom generic tests are SQL queries that return failures. Any rows returned by a test query count as failures. This mental model makes writing custom tests straightforward.
Singular tests are SQL files in the tests/ directory. They're one-off checks that don't need to be reusable—useful for complex business logic assertions.
Test severity controls whether failures block the build. Use severity: warn for data quality alerts that shouldn't halt pipeline execution.
Why dbt Tests Matter
Data pipelines are different from application code. A bug in application code raises an exception immediately. A bug in a SQL transformation silently produces wrong numbers that flow into dashboards, reports, and downstream models before anyone notices.
dbt tests are the mechanism to catch these silent failures. They run assertions against your transformed data and fail the pipeline if the data doesn't meet expectations. The result: data quality issues surface at transformation time, not when a stakeholder spots an anomaly in a dashboard.
Built-in dbt Tests
dbt ships with four schema tests that you configure in schema.yml files:
models:
- name: orders
columns:
- name: order_id
tests:
- not_null
- unique
- name: status
tests:
- accepted_values:
values: ['placed', 'shipped', 'returned', 'completed']
- name: customer_id
tests:
- relationships:
to: ref('customers')
field: customer_idnot_null — verifies no nulls in the column. Catches missing data from upstream sources.
unique — verifies no duplicate values. Essential for primary keys and business identifiers.
accepted_values — verifies column values are within a defined set. Catches unexpected status codes, categories, or enum values.
relationships — verifies foreign key integrity between models. Catches joins that would produce unexpected nulls or duplicates.
Run all tests with:
dbt testRun tests for a specific model:
dbt test --<span class="hljs-keyword">select ordersRun only a specific test type:
dbt test --<span class="hljs-keyword">select test_type:not_nulldbt-expectations Package
The dbt-expectations package ports Great Expectations assertions into dbt's testing framework. Install it by adding to packages.yml:
packages:
- package: calogica/dbt_expectations
version: [">=0.10.0", "<0.11.0"]Run dbt deps to install.
Value Range Tests
Check that numeric columns stay within expected bounds:
models:
- name: orders
columns:
- name: order_amount
tests:
- dbt_expectations.expect_column_values_to_be_between:
min_value: 0
max_value: 10000
strictly: falsePattern Matching Tests
Validate string formats with regex:
- name: email
tests:
- dbt_expectations.expect_column_values_to_match_regex:
regex: "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"Statistical Tests
Catch sudden distribution shifts that might indicate data pipeline issues:
- name: order_amount
tests:
- dbt_expectations.expect_column_mean_to_be_between:
min_value: 50
max_value: 500
- dbt_expectations.expect_column_stdev_to_be_between:
min_value: 10
max_value: 200Row Count Tests
Ensure models have the expected number of rows—useful for catching full table drops or filter bugs:
models:
- name: orders
tests:
- dbt_expectations.expect_table_row_count_to_be_between:
min_value: 1000
max_value: 1000000Value Set Tests
Check that a column contains all expected values (not just a subset):
- name: status
tests:
- dbt_expectations.expect_column_distinct_values_to_equal_set:
value_set: ['placed', 'shipped', 'returned', 'completed']Custom Generic Tests
When built-in tests and dbt-expectations don't cover your case, write a custom generic test. Create a SQL file in the tests/generic/ directory:
-- tests/generic/assert_column_sum_equals.sql
{% test assert_column_sum_equals(model, column_name, expected_sum, tolerance=0) %}
with calculated as (
select sum({{ column_name }}) as actual_sum
from {{ model }}
),
validation as (
select
actual_sum,
{{ expected_sum }} as expected_sum,
abs(actual_sum - {{ expected_sum }}) as difference
from calculated
)
select *
from validation
where difference > {{ tolerance }}
{% endtest %}Use it in schema.yml like any built-in test:
- name: revenue
tests:
- assert_column_sum_equals:
expected_sum: 1000000
tolerance: 5000Generic Test with Multiple Columns
Tests can reference multiple columns. Example: verify that start date is always before end date:
-- tests/generic/assert_column_a_less_than_column_b.sql
{% test assert_column_a_less_than_column_b(model, column_name, column_b) %}
select *
from {{ model }}
where {{ column_name }} >= {{ column_b }}
{% endtest %}Usage:
- name: start_date
tests:
- assert_column_a_less_than_column_b:
column_b: end_dateSingular Tests
Singular tests are SQL files in the tests/ directory (not tests/generic/). They're one-off assertions rather than reusable generic tests. Any rows returned count as failures.
-- tests/assert_revenue_reconciles_with_payments.sql
-- Revenue in orders should match payment totals within 1%
with order_revenue as (
select sum(order_amount) as total_revenue
from {{ ref('orders') }}
where status = 'completed'
),
payment_total as (
select sum(payment_amount) as total_payments
from {{ ref('payments') }}
where payment_status = 'settled'
),
comparison as (
select
o.total_revenue,
p.total_payments,
abs(o.total_revenue - p.total_payments) / o.total_revenue as discrepancy_pct
from order_revenue o
cross join payment_total p
)
select *
from comparison
where discrepancy_pct > 0.01Singular tests are best for complex reconciliation checks, cross-model consistency checks, and business rule validations that are specific to your domain.
Test Severity
By default, failing tests block the pipeline. Use severity to convert failures into warnings instead:
- name: order_amount
tests:
- dbt_expectations.expect_column_mean_to_be_between:
min_value: 50
max_value: 500
severity: warnWith severity: warn, the test failure appears in output but doesn't fail the dbt test command. Use warnings for data quality monitoring that shouldn't block pipelines—things like statistical drift checks where occasional out-of-bound values are expected.
You can also set a failure threshold:
- dbt_expectations.expect_column_values_to_be_between:
min_value: 0
max_value: 10000
config:
severity: warn
warn_if: ">10"
error_if: ">100"Running Tests in CI
A typical CI workflow runs tests after every model build:
# .github/workflows/dbt-ci.yml
name: dbt CI
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install dbt
run: pip install dbt-bigquery # or dbt-snowflake, dbt-postgres, etc.
- name: Install packages
run: dbt deps
working-directory: ./dbt
- name: Build models
run: dbt build --target ci
working-directory: ./dbt
env:
DBT_PROFILES_DIR: .dbt build runs dbt run and dbt test together, so models and tests run in dependency order. If a model fails to build, its downstream tests are skipped.
Test Coverage by Source
Run source freshness checks alongside model tests:
dbt source freshness
dbt build --<span class="hljs-keyword">select <span class="hljs-built_in">source:*+This checks that your raw source tables are being updated within their expected freshness windows before testing models that depend on them.
Organizing Tests by Layer
A practical pattern: apply stricter tests closer to raw data, more business-logic-focused tests on marts.
Staging models — validate source data integrity:
- not_null on primary keys and required fields
- accepted_values on status/type columns
- relationships to verify FK integrity
- regex tests on email, phone, ID formats
Intermediate models — validate transformation logic:
- uniqueness on grain keys
- value ranges that should hold after transformations
- custom tests for business rule enforcement
Mart models — validate business metrics:
- row count bounds (marts shouldn't suddenly shrink)
- statistical range checks on KPI columns
- reconciliation tests against source of truth
Testing with HelpMeTest
For end-to-end validation—verifying that dashboards built on top of your dbt models display correctly—use HelpMeTest alongside dbt tests. HelpMeTest runs browser-based tests against your BI tools and can catch rendering issues, broken filters, and visual anomalies that SQL tests can't detect.
Pair dbt tests (data layer) with HelpMeTest (presentation layer) for full coverage: dbt ensures the numbers are right, HelpMeTest ensures the dashboards show them correctly.
Summary
dbt's testing framework is layered:
- Built-in tests — not_null, unique, accepted_values, relationships for core data integrity
- dbt-expectations — 50+ assertions for ranges, patterns, distributions, and row counts
- Custom generic tests — reusable SQL assertions for your domain-specific rules
- Singular tests — one-off SQL assertions for complex reconciliation checks
Start with built-in tests on every model's primary key and critical columns. Add dbt-expectations for statistical monitoring. Write custom tests for business rules that don't fit the standard patterns. Run everything in CI to catch data quality issues before they reach production.