dbt doesn't currently support var declarations outside of the dbt_project.yml
file - see relevant issue. This means that multiple users that use vars may run into conflicts or issues as they are all updating the dbt_project.yml
file at the same time.
This is a temporary workaround while the core team works on enabling the above functionality.
- Setup the
dbt_project.yml
file:
# dbt_project.yml
name: my_dbt_project
profile: snowflake
config-version: 2
version: 1.0
models:
my_dbt_project:
+materialized: table
As you can tell from above, it is not necessary to set any vars in dbt_project.yml
for this exercise.
- Let's create a macro that looks up a dictionary and returns the value:
-- macros/cvar.sql
{% macro cvar(var_name) -%}
{%-
set all_project_vars = {
"unit": var("unit", "day"),
"offset": var("offset", "1")
}
-%}
{{ return(all_project_vars[var_name]) }}
{%- endmacro %}
Importantly, all_project_vars
is a dictionary with each key value assigned to an actual dbt var with a default value - this is so that we can override the value at runtime (we'll come to that a bit later) but still have the value be set to something (just like we would in the dbt_project.yml
file).
- Add a toy model to test out the above pattern.
-- models/the_right_time.sql
select
dateadd(
'{{ cvar("unit") }}',
'{{ cvar("offset") }}',
to_timestamp_ntz('1970-01-01')
) as t
- Now, let's try out some runs and see the results:
$ dbt run
00:24:16 On model.my_dbt_project.the_right_time: /* {"app": "dbt", "dbt_version": "1.5.1", "profile_name": "snowflake", "target_name": "default", "node_id": "model.my_dbt_project.the_right_time"} */
create or replace transient table development.dbt_jyeo.the_right_time
as
(select
dateadd(
'day',
'1',
to_timestamp_ntz('1970-01-01')
) as modified_t
);
$ select * from development.dbt_jyeo.the_right_time;
+-------------------------+
| MODIFIED_T |
|-------------------------|
| 1970-01-02 00:00:00.000 |
+-------------------------+
As expected, we didn't have any run time var overrides so we added a day to the epoch date. Now, let's try some run time overrides:
$ dbt run --vars 'offset: 10'
00:27:21 On model.my_dbt_project.the_right_time: /* {"app": "dbt", "dbt_version": "1.5.1", "profile_name": "snowflake", "target_name": "default", "node_id": "model.my_dbt_project.the_right_time"} */
create or replace transient table development.dbt_jyeo.the_right_time
as
(select
dateadd(
'day',
'10',
to_timestamp_ntz('1970-01-01')
) as modified_t
);
$ select * from development.dbt_jyeo.the_right_time;
+-------------------------+
| MODIFIED_T |
|-------------------------|
| 1970-01-11 00:00:00.000 |
+-------------------------+
^ 10 days added as expected as we overrode the offset
var at run time. And finally, overriding both vars at the same time:
$ dbt run --vars '{unit: year, offset: 50}'
00:29:45 On model.my_dbt_project.the_right_time: /* {"app": "dbt", "dbt_version": "1.5.1", "profile_name": "snowflake", "target_name": "default", "node_id": "model.my_dbt_project.the_right_time"} */
create or replace transient table development.dbt_jyeo.the_right_time
as
(select
dateadd(
'year',
'50',
to_timestamp_ntz('1970-01-01')
) as modified_t
);
$ select * from development.dbt_jyeo.the_right_time;
+-------------------------+
| MODIFIED_T |
|-------------------------|
| 2020-01-01 00:00:00.000 |
+-------------------------+
We've successfully added 50 years.
As users want to add new vars to the project, they simply need to modify the macro by adding it to the dictonary and not actually have to touch dbt_project.yml
:
-- macros/cvar.sql
{% macro cvar(var_name) -%}
{%-
set all_project_vars = {
"unit": var("unit", "day"),
"offset": var("offset", "1"),
"some_new_variable": var("some_new_variable", "foo")
}
-%}
{{ return(all_project_vars[var_name]) }}
{%- endmacro %}
Because it's a macro you can not use it in db_project.yml, correct? Hence why we stick with environment variables :(