Understanding the scopes of dbt tags

Yu Ishikawa
Nov 5, 2020

dbt (data build tool) is a great tool, as I wrote before in “5 reasons why BigQuery users should use dbt”. In particular, dbt tags are very useful for selecting models depending on the situation, by taking advantage of the model selection syntax. In this article, I describe the scopes of dbt tags, which I misunderstood at first and which can be a pitfall for others too.

How can we use dbt tags?

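Assume we have a dbt source with tags at various levels: tag_x on the source table, tag_y on the id column, and tag_z on the unique test of the id column. In other words, we can attach dbt tags to a source, a column, and a test respectively.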
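A minimal sketch of such a source definition is shown here. The source, table, and column names are inferred from the test names in the command output later in this article. Putting tag_x on the table entry rather than on the source entry is an assumption, as is the exact layout of the file.

```yaml
version: 2

sources:
  - name: sample_gcp_project__sample_dataset
    tables:
      - name: users
        # Table-level tag (assumed placement)
        tags: ["tag_x"]
        columns:
          - name: id
            # Column-level tag
            tags: ["tag_y"]
            tests:
              - not_null
              - unique:
                  # Test-level tag
                  tags: ["tag_z"]
          - name: name
            tests:
              - not_null
```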

The dbt CLI provides a very useful syntax for selecting dbt models, tests, and so on. If we want to run only the dbt schema tests with the tag_z tag, we can pass the selector to the --models option.

$ dbt test --models "tag:tag_z"

What are the scopes of dbt tags by level?

At first, I didn’t understand the scopes of dbt tags correctly. In the case above, I expected that executing dbt test --models "tag:tag_x" would not run the unique schema test on the id column. But the schema test is executed. It seems that dbt tags have a kind of scope inheritance.

Table-level tags

I call tags like tag_x table-level tags. Their scope covers every schema test under the source: not_null and unique on the id column, and not_null on the name column, even though none of these tests has tag_x itself.

$ dbt test --models "tag:tag_x"
16:21:46 | 1 of 3 START test not_null_sample_gcp_project__sample_dataset__users_id [RUN]
16:21:50 | 2 of 3 START test source_unique_sample_gcp_project__sample_dataset__users_id [RUN]
16:21:55 | 3 of 3 START test not_null_sample_gcp_project__sample_dataset__users_name [RUN]

Column-level tags

I call tags like tag_y column-level tags. Their scope covers the schema tests of the id column, not_null and unique, even though neither test has tag_y itself.

$ dbt test --models "tag:tag_y"
16:21:46 | 1 of 2 START test not_null_sample_gcp_project__sample_dataset__users_id [RUN]
16:21:50 | 2 of 2 START test source_unique_sample_gcp_project__sample_dataset__users_id [RUN]

Test-level tags

This case is intuitive. I call tags like tag_z test-level tags. Their scope covers only the unique schema test of the id column.

$ dbt test --models "tag:tag_z"
16:21:50 | 1 of 1 START test source_unique_sample_gcp_project__sample_dataset__users_id [RUN]

Summary

In this article, I described the different scopes of dbt tags.

  • Table-level tags affect all schema and data tests under a source.
  • Column-level tags affect all schema tests under a column.
  • Test-level tags affect only the schema test they are attached to.
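
The three rules above can be illustrated with a small toy model in Python. This is only a sketch of the observed behavior, not how dbt is implemented. The SOURCE dictionary and the effective_tags and select_tests helpers are hypothetical names made up for this illustration. The tags and tests mirror the example in this article.

```python
# Toy model of tag scope inheritance. This is NOT dbt's implementation;
# SOURCE, effective_tags and select_tests are hypothetical names.
SOURCE = {
    "tags": {"tag_x"},  # table-level tag
    "columns": {
        "id": {
            "tags": {"tag_y"},  # column-level tag
            "tests": {"not_null": set(), "unique": {"tag_z"}},  # test-level tag
        },
        "name": {
            "tags": set(),
            "tests": {"not_null": set()},
        },
    },
}


def effective_tags(source):
    """Map each (column, test) pair to its own tags plus the inherited ones."""
    result = {}
    for column_name, column in source["columns"].items():
        for test_name, test_tags in column["tests"].items():
            # A test inherits the tags of its column and of its table.
            result[(column_name, test_name)] = (
                source["tags"] | column["tags"] | test_tags
            )
    return result


def select_tests(source, tag):
    """Return the sorted (column, test) pairs that a tag selector would match."""
    return sorted(
        key for key, tags in effective_tags(source).items() if tag in tags
    )


if __name__ == "__main__":
    for tag in ("tag_x", "tag_y", "tag_z"):
        print(tag, select_tests(SOURCE, tag))
```

In this model, tag_x matches three tests, tag_y matches two, and tag_z matches one, which is the same pattern as in the dbt test runs shown earlier.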
