Unit testing your dbt package

Yu Ishikawa
5 min readJul 11, 2022

--

We are all seeing the rapid growth of dbt’s popularity today. The number of users and the size of its community is on the rise. There are various aspects of dbt which attract experts in the data world. I love all of the features of dbt too. Specifically, I love dbt packages, because it makes it easy to extend dbt’s functionality. Moreover, we can reuse dbt packages which others developed, finding great packages in dbt hub. However, it is a bit difficult to develop complicated dbt macros, because dbt on top of Jinja2 doesn’t have unit testing features. In this article, I would like to share best practices to implement unit tests for your dbt packages so that we can provide maintainable and trustable quality extensions.

Basic idea of unit testing dbt packages

Many dbt packages use integration tests as their primary testing methodology. For example dbt-utils has the integration_tests directory so that we can run integration tests by using the generic tests and macros contained within the package. The integration tests directory is a standard dbt project within the dbt-utils package.

To use the integration tests — you’d simply run dbt testlike within the integration_tests directory. The tests execute as normal — meaning you can use your favorite methods of running CI against your dbt project to ensure that your integration tests are passing.

Integration tests can help give you peace of mind that your package is performing as expected — but they have some drawbacks. Macros and generic tests frequently call other macros the deeper calling dependency gets, the more difficult it becomes to debug such macros using only integration tests.

In this scenario it can be helpful to extend the idea of the integration tests for unit tests for your macros. These unit tests can be run with a dbt run-operation. We’ll walk through a quick example of how this can be done.

Let’s consider a dbt package called dbt_sample_package. We would like to implement a simple macro to create a string literal from a string text in a macro named to_literal in the file macros/to_literal.sql.

-- macros/to_literal.sql
{% macro to_literal(text) %}
'{{- text -}}'
{% endmacro %}

To implement a unit testing macro corresponding to the to_literal macro we can create a macro to test our original macro in integration_tests/macros/test_to_literal.sql . Then we call the to_literal macro in the testing macro. and if the result isn’t the same as expected, we raise an error using the expectations.raise_compiler_error macro.

-- integration_tests/macros/test_to_literal.sql
{% macro test_to_literal() %}
{% result = dbt_sample_package.to_literal('test string') %}
{% if result != "'test string'" %}
{{ exceptions.raise_compiler_error('The test is failed') }}
{% endif %}
{% endmacro%}

By doing that, we can call the testing macro in the dbt project of integration tests using dbt run-operation.

dbt run-operation test_to_literal

If we want to run all tests with a single command, it would be good to bundle them in a macro. Moreover, we can call the macro with dbt run-operation .

-- integration_tests/macros/run_unit_tests.sql
{% macro run_unit_tests() %}
{% do test_to_literal() %}{% endmacro %}

Unit tests for multiple adapters

Your dbt package may support multiple adapters. If you are a postgres user, you understand the preceding to_literal macro doesn’t work on postgres because the expression to deal with string literal is different. So, we have to implement a macro to handle a special case of postgres. Now, we implement the subsequent macro called postgres__to_literal in macros/to_literal.sql in addition to the implementation above.

-- macros/to_literal.sql
{% macro to_literal(text) %}
{{ return(adapter.dispatch('literal_string', 'dbt_sample_package')(text)) }}
{% endmacro %}
{% macro default__to_literal(text) %}
'{{- text -}}'
{% endmacro %}
{% macro postgress__to_literal(text) %}
E'{{- text -}}'
{% endmacro %}

You may think of how we can implement unit testing macros efficiently. We can use the the adapter.dispatch macro even in unit testing macros. As we separate the behavior for postgres, we can implement an independent unit testing macro for postgres as well.

-- integration_tests/macros/test_to_literal.sql
{% macro test_to_literal() %}
{{ return(adapter.dispatch('test_to_literal', 'integration_tests')(text)) }}
{% endmacro %}
{% macro default__test_to_literal() %}
{% result = dbt_sample_package.to_literal('test string') %}
{% if result != "'test string'" %}
{{ exceptions.raise_compiler_error('The test is failed') }}
{% endif %}
{% endmacro%}
{% macro postgres__test_to_literal() %}
{% result = dbt_sample_package.to_literal('test string') %}
{% if result != "E'test string'" %}
{{ exceptions.raise_compiler_error('The test is failed') }}
{% endif %}
{% endmacro%}

We can then select unit tests based on the a specified adapter. Let’s assume we have different dbt profiles corresponding to adapters, BigQuery and postgres. By specifying a dbt profile based on the adapter, we can select what testing macros are internally called.

# Run unit tests on BigQuery
# `default__test_to_literal` is internally called.
dbt run-operation run_unit_tests --profile bigquery
# Run unit tests on postgres
# `postgres__test_to_literal` is internally called.
dbt run-operation run_unit_tests --profile postgres

Introduce dbt-unittest

It’s historically been a challenge to do unit testing in your dbt packaging asjinja2 doesn’t offer a built in unit testing feature. But, we have good news. dbt provides the exceptions.raise_compiler_error macro so that we raise errors within a dbt run-operation. Using this, I implemented a dbt package called yu-iskw/dbt-unittest, which is inspired by python’s unittest module, to enhance unit testing of dbt package development.

Using this, we can re-implement the example about using the dbt_unittest.assert_equals macro and the implementation gets much simpler.

-- integration_tests/macros/test_to_literal.sql
{% macro test_to_literal() %}
{{ return(adapter.dispatch('test_to_literal', 'integration_tests')(text)) }}
{% endmacro%}
{% macro default__test_to_literal() %}
{% result = dbt_sample_package.to_literal('test string') %}
{{ dbt_unittest.assert_equals(result, "'test string'") }}
{% endmacro %}
{% macro postgress__test_to_literal() %}
{% result = dbt_sample_package.to_literal('test string') %}
{{ dbt_unittest.assert_equals(result, "E'test string'") }}
{% endmacro %}

I practiced the idea even in the development of yu-iskw/dbt-unittest . The actual testing macros are located here. Moreover, we are able to implement the continuous integration workflow as regular software development. For instance, I implemented a workflow with GitHub Actions. It enables me to notice there is something wrong with changes.

Aside from that, it would be great to take a look at other dbt packages for integration testing and unit testing on dbt hub. For instance, the dbt_datamocktool package is a simple package for unit testing dbt projects. We can create mock CSV seeds to stand in for the sources and refs that your models use and test that the model produces the desired output. That would be useful to mock testing data for your dbt project.

Summary

In this article we’ve:

  • Introduced the two main frameworks for testing your dbt packages
  • Implemented a simple unit testing example
  • Shown how you can use existing tools to help build out your unit testing capabilities

Hopefully this is helpful to you in your dbt package development journey.

--

--

Yu Ishikawa

Data Engineering / Machine Learning / MLOps / Data Governance / Privacy Engineering