One handy (and scary 👻) thing that Jinja adds on top of SQL is the ability to run for
loops. Here's a rough example that shows how to simplify the statement Frank provided in models__staging__mqis_raw_fiscal_area_shift.sql
using loops:
- Create a
.csv
file with thesid_id
totime_zone
mapping (dim_plant_timezones.csv
) and place it under a/seeds
directory. This mapping can be directly embedded into the model, too, but the seed approach guarantees that the file can be checked into version control and possibly reused across models:
sid_id,time_zone
22,america/denver
23,america/chicago
24,america/new_york
26,america/chicago
27,america/new_york
29,america/chicago
31,america/chicago
34,america/chicago
40,america/new_york
47,america/new_york
60,america/chicago
70,america/chicago
77,america/chicago
80,america/denver
82,america/chicago
84,america/chicago
91,america/new_york
92,america/chicago
96,america/los_angeles
99,america/new_york
114,america/chicago
2045,america/chicago
2048,america/new_york
2049,america/chicago
2050,america/chicago
2051,america/chicago
- Create a sample table in Materialize and insert some data into it:
CREATE TABLE sample_data
(
sid_id text,
name text,
fq_beg_dt timestamp
);
INSERT INTO sample_data VALUES (22,'Plant 1 DNV', now()), (96,'Plant 1 LA', now()), (24,'Plant 1 NY', now());
- Create a model (
test_caseloop.sql
) and place it under/models
. Use the seed to populate the case statement:
{%- set case_data -%}
select sid_id, time_zone
from {{ ref('dim_plant_timezones') }}
{%- endset -%}
{%- set results = run_query(case_data) -%}
{%- if execute -%}
select sid_id,
name,
case
{%- for r in results %}
when sid_id='{{ r["sid_id"] }}' then fq_beg_dt::timestamp at time zone '{{ r["time_zone"] }}'
{%- if loop.last -%}
end as fy_beg_dt
{%- endif -%}
{% endfor %}
from sample_data
{% endif %}
- Run
dbt build
and check the compiled SQL under/target/compiled/.../models/test_caseloop.sql
. It should look like:
select sid_id,
name,
case
when sid_id='22' then fq_beg_dt::timestamp at time zone 'america/denver'
when sid_id='80' then fq_beg_dt::timestamp at time zone 'america/denver'
when sid_id='2049' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='2050' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='2051' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='23' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='26' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='29' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='31' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='34' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='60' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='70' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='77' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='82' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='84' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='92' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='114' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='2045' then fq_beg_dt::timestamp at time zone 'america/chicago'
when sid_id='2048' then fq_beg_dt::timestamp at time zone 'america/new_york'
when sid_id='24' then fq_beg_dt::timestamp at time zone 'america/new_york'
when sid_id='27' then fq_beg_dt::timestamp at time zone 'america/new_york'
when sid_id='40' then fq_beg_dt::timestamp at time zone 'america/new_york'
when sid_id='47' then fq_beg_dt::timestamp at time zone 'america/new_york'
when sid_id='91' then fq_beg_dt::timestamp at time zone 'america/new_york'
when sid_id='99' then fq_beg_dt::timestamp at time zone 'america/new_york'
when sid_id='96' then fq_beg_dt::timestamp at time zone 'america/los_angeles'end as fy_beg_dt
from sample_data
- Verify that the model was created in Materialize, and that the data looks correct:
materialize=> SELECT * FROM test_caseloop;
sid_id | name | fy_beg_dt
--------+-------------+----------------------------
24 | Plant 1 NY | 2023-07-19 06:45:56.488+00
96 | Plant 1 LA | 2023-07-19 09:45:56.488+00
22 | Plant 1 DNV | 2023-07-19 08:45:56.488+00
(3 rows)
Need to run for my flight, but it should be trivial to extend this to also iterate over the different timestamp fields Frank needs to generate (that's...maybe one level of debugging insanity too deep, if you ask me; but possible).