dbt Tests: Complete Guide to Data Quality Checks
dbt tests are automated data quality assertions that run against the tables and views built by your dbt models. They answer a simple question: "does the data in this model match our expectations?" A test that passes means the assumption holds in the current data; a test that fails surfaces a data quality problem before it silently corrupts downstream dashboards, machine learning features, or financial reports.
dbt tests are SQL SELECT statements that return rows when an assertion fails (zero rows = pass). There are two types: generic tests declared in schema.yml (4 built-in: not_null, unique, accepted_values, relationships) and singular tests — custom SQL files in the tests/ directory. Community packages like dbt_utils and dbt_expectations add 50+ additional test types. Test results feed governance platforms like Dawiso to provide data quality visibility across the catalog.
What Are dbt Tests?
dbt ships with a built-in testing framework that requires no additional infrastructure — tests are just SQL queries that dbt compiles and executes against your warehouse. Every test compiles to a SELECT statement that returns rows when the assertion fails. If the query returns zero rows, the test passes. This approach is powerful because:
- Tests run inside your warehouse — no additional compute or tooling required
- Tests are code — they live in version control alongside the models they protect
- Tests serve as executable documentation — they define what "correct data" looks like
- Test failures can block CI/CD pipelines — preventing bad data from reaching production
Without automated tests, data quality regressions are discovered the hard way: an analyst spots an anomaly, a business stakeholder questions a number, or a customer-facing product shows incorrect information. By the time the problem surfaces, bad data may have been in production for days or weeks. dbt tests shift quality left — catching problems at the transformation layer before downstream consumers are affected.
Generic Tests: The 4 Built-in Types
Generic tests are parameterized test definitions that can be applied to any model and column by referencing them in a .yml descriptor file. dbt ships with exactly four built-in generic tests:
not_null
Asserts that a column contains no NULL values. This is the most fundamental test — a column that should always have a value shouldn't silently accept nulls.
unique
Asserts that all values in a column are distinct. Apply this to every primary key column in every model — duplicate primary keys cause silent double-counting in downstream aggregations.
accepted_values
Asserts that every value in a column belongs to a predefined list. Use for status fields, type enumerations, and ISO codes.
relationships
Asserts referential integrity: every value in a column exists in a specified column of another model — a foreign key check. This catches orphaned records when joins would silently drop rows.
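Conceptually, a relationships test compiles to a query along the following lines (simplified here for illustration; dbt's actual compiled SQL wraps this differently):

-- Simplified illustration of a compiled relationships test:
-- find orders whose customer_id has no match in stg_customers.
-- NULL foreign keys are ignored; pair with not_null to forbid them.
select child.customer_id
from analytics.staging.stg_orders as child
left join analytics.staging.stg_customers as parent
    on child.customer_id = parent.customer_id
where child.customer_id is not null
  and parent.customer_id is null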
Singular Tests: Custom SQL Assertions
For business rules that are too specific to generalize, write a singular test — a .sql file in the tests/ directory containing a SELECT that returns the offending rows:
-- tests/assert_orders_amount_positive.sql
-- Every delivered order must have a positive amount.
select
    order_id,
    amount
from {{ ref('stg_orders') }}
where status = 'delivered'
  and amount <= 0

-- tests/assert_revenue_matches_order_sum.sql
-- The monthly revenue mart must match the sum of staging orders.
select
    month,
    mart_revenue,
    raw_revenue,
    abs(mart_revenue - raw_revenue) as discrepancy
from (
    select
        date_trunc('month', created_at) as month,
        sum(amount) as raw_revenue
    from {{ ref('stg_orders') }}
    where status = 'delivered'
    group by 1
) raw
join (
    select month, revenue as mart_revenue
    from {{ ref('fct_revenue') }}
) mart using (month)
where abs(mart_revenue - raw_revenue) > 0.01

Any row returned by these SELECT statements is a test failure. Singular tests are also a form of executable documentation — they make business rules legible and auditable in code rather than spreadsheets.
Configuring Generic Tests in schema.yml
Generic tests are declared in schema.yml (or any .yml file in the models/ directory) under the tests key for a model or column:
# models/staging/_staging.yml
version: 2

models:
  - name: stg_orders
    description: "Cleaned order records from the transactional database."
    columns:
      - name: order_id
        description: "Surrogate primary key for each order."
        tests:
          - unique
          - not_null
      - name: status
        description: "Current fulfillment status of the order."
        tests:
          - not_null
          - accepted_values:
              values: ['placed', 'shipped', 'delivered', 'cancelled', 'returned']
      - name: customer_id
        description: "Foreign key to stg_customers."
        tests:
          - not_null
          - relationships:
              to: ref('stg_customers')
              field: customer_id
      - name: amount
        description: "Order total in USD."
        tests:
          - not_null

Run all tests with dbt test, or tests for a specific model with dbt test --select stg_orders.
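A few other common invocations, using dbt's standard node selection syntax:

# Run every test in the project
dbt test

# Run only the tests attached to one model
dbt test --select stg_orders

# Run only singular (custom SQL) tests
dbt test --select test_type:singular

Each test compiles to a SQL query like: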
-- Compiled not_null test on stg_orders.order_id
select count(*) as failures
from (
    select order_id
    from analytics.staging.stg_orders
    where order_id is null
) dbt_internal_test

Test Severity: warn vs error
By default, a failing test causes dbt test to exit with a non-zero status code, blocking CI pipelines. Tune this with severity:
# models/staging/_staging.yml
models:
  - name: stg_orders
    columns:
      - name: amount
        tests:
          - not_null:
              severity: warn    # warn but do not fail the pipeline
          - dbt_utils.expression_is_true:
              expression: ">= 0"
              severity: error   # block pipeline on negative amounts

Use warn for soft expectations — data quality issues you want to be aware of but that do not necessarily block a deployment (e.g., a new optional field not yet fully populated). Use error (the default) for hard contracts — violations that indicate genuinely broken data.
You can also set a threshold — fail only when the proportion or count of failing rows exceeds a limit, useful for large tables where a small percentage of nulls may be acceptable:
- name: email
  tests:
    - not_null:
        config:
          warn_if: ">10"
          error_if: ">100"

Community Packages: dbt_utils and dbt_expectations
The dbt package ecosystem dramatically extends the built-in test library. Two packages are widely used:
dbt_utils
dbt_utils (maintained by dbt Labs) provides utilities including generic tests not in dbt core:
- expression_is_true: Test an arbitrary SQL expression: amount > 0, end_date >= start_date
- not_constant: Assert a column has more than one distinct value (catches frozen feeds)
- recency: Assert data is fresh within a specified time window
- equal_rowcount: Assert two models have the same number of rows
- cardinality_equality: Assert two columns have the same set of distinct values
# dbt_utils examples
models:
  - name: stg_subscriptions      # illustrative model
    tests:
      - dbt_utils.recency:       # model-level test: data no older than 25 hours
          datepart: hour
          field: updated_at
          interval: 25
    columns:
      - name: end_date
        tests:
          - dbt_utils.expression_is_true:
              expression: ">= start_date"

dbt_expectations
dbt_expectations ports the Great Expectations vocabulary to dbt, offering over 50 test types including:
- expect_column_values_to_be_between: Assert values fall within a numeric range
- expect_table_row_count_to_be_between: Assert row count bounds (catches empty loads)
- expect_column_values_to_match_regex: Validate string format (email addresses, ISO codes)
- expect_column_mean_to_be_between: Statistical distribution checks
# dbt_expectations examples
models:
  - name: fct_revenue
    tests:
      - dbt_expectations.expect_table_row_count_to_be_between:
          min_value: 1
          max_value: 5000
    columns:
      - name: revenue
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 10000000

Install packages by declaring them in packages.yml and running dbt deps:
# packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
  - package: calogica/dbt_expectations
    version: [">=0.10.0", "<1.0.0"]

Test Coverage Strategy
A good test coverage strategy balances thoroughness with maintainability. Not every column needs every test — over-testing creates noise and slows builds without proportional quality gains.
Always Test
- Primary keys: unique + not_null on every model's grain key — this is the minimum acceptable test coverage
- Foreign keys: relationships tests on all join columns between models
- Categorical columns: accepted_values on status fields, type enumerations, and ISO codes
- Required business fields: not_null on columns that must never be null by domain rules
Test Strategically
- Aggregate consistency: Singular tests comparing row counts or totals between layers (staging vs. mart)
- Freshness: dbt_utils.recency on event/transaction tables to catch stale data
- Range checks: dbt_expectations.expect_column_values_to_be_between for numeric columns with known bounds
Skip or De-prioritize
- Free-text fields with no format expectations
- Columns that are genuinely optional by design (sparse features)
- Tests that would require full table scans on petabyte-scale tables — consider restricting them with the where and limit test configs or with sampling macros (see the sketch after this list)
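A minimal sketch of scoping an expensive test with dbt's where and limit test configs. The column names and date expression are illustrative, and the date function syntax varies by warehouse:

- name: event_id
  tests:
    - not_null:
        config:
          # Only scan the last three days of data (Snowflake-style syntax)
          where: "created_at >= dateadd(day, -3, current_date)"
          # Return at most 1,000 failing rows
          limit: 1000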
Running Tests in CI/CD
The recommended CI/CD command is dbt build, which runs models and tests in dependency order. If a test fails, dbt stops building downstream models — preventing bad data from propagating further.
# Example GitHub Actions workflow step
- name: dbt build (models + tests)
  run: |
    dbt deps
    dbt build --target ci --select state:modified+
  env:
    DBT_TARGET_SCHEMA: ci_${{ github.run_id }}

Key CI/CD practices for dbt tests:
- Slim CI: Use state:modified+ to only run models and tests affected by the current PR — requires storing and comparing the previous manifest.json (see the command sketch after this list)
- Separate CI schema: Run CI builds in an isolated schema so they do not interfere with production or development
- Fail fast: Let error-severity test failures block merges to the main branch
- Artifact storage: Store run_results.json and manifest.json after each CI run for state comparison in the next run and for governance integrations
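Slim CI works by pointing dbt's --state flag at the artifacts saved from a previous production run; the artifact path below is illustrative:

# Build only nodes changed relative to the stored production manifest
dbt build --target ci --select state:modified+ --state ./prod-run-artifacts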
Connecting to Data Observability
dbt test results are a primary signal for data observability — the practice of continuously monitoring data pipelines for quality, freshness, and schema drift. Platforms like Dawiso ingest dbt test results (via run_results.json and the manifest) to provide governance-layer visibility into data quality across the entire transformation layer.
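Each test run appears as an entry in run_results.json. An abbreviated, illustrative excerpt of the fields an observability integration typically reads (the values here are made up):

{
  "results": [
    {
      "unique_id": "test.analytics.not_null_stg_orders_order_id",
      "status": "pass",
      "failures": 0,
      "execution_time": 0.42
    },
    {
      "unique_id": "test.analytics.unique_stg_orders_order_id",
      "status": "fail",
      "failures": 12,
      "execution_time": 0.57
    }
  ]
}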
Specifically, Dawiso can surface:
- Which models have test coverage and which are untested (a governance risk)
- Historical test pass/fail trends to identify reliability patterns
- Column-level quality scores based on test results
- Alerts when tests that previously passed start failing
By connecting dbt test metadata to data catalog entries in Dawiso, data consumers can see not just what a column means (description from schema.yml) but also whether it is currently validated (test coverage) and when it last passed its quality checks — all without leaving the catalog interface. This bridges the gap between the engineering team's dbt test workflow and the governance team's need for evidence that data assets meet quality standards before they are consumed by business-critical processes.
For a complete picture of how test results fit into the broader lineage and metadata picture, see the dbt lineage guide.
Common Testing Anti-Patterns to Avoid
- Testing without acting: Tests that always warn but are never investigated create alert fatigue. Every warning should have an owner and a remediation plan.
- Not testing primary keys: Skipping unique + not_null on grain keys is the most common oversight — duplicate keys cause silent double-counting in downstream aggregations.
- Hardcoding accepted values without syncing to source: accepted_values lists that are not kept in sync with source system enumerations become stale. Consider loading reference data as seeds and using a relationships test instead (a sketch follows this list).
- Running full tests in every CI run: On large projects, running all tests on every commit is too slow. Use slim CI with state:modified+ and run the full test suite on a schedule (e.g., nightly).
- Skipping tests on staging models: Problems caught at the staging layer are cheaper to fix than problems caught in the mart layer. Always add primary key and not_null tests to staging models.
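A minimal sketch of the seed-backed approach, with an illustrative seed name (order_statuses) and column:

# seeds/order_statuses.csv (illustrative): one row per valid status
# status
# placed
# shipped
# delivered
# cancelled
# returned

# models/staging/_staging.yml
- name: status
  tests:
    - relationships:
        to: ref('order_statuses')   # seeds are referenced with ref()
        field: status

Updating the seed CSV then keeps the allowed list in version control, right next to the test that enforces it.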