Skip to content

Instantly share code, notes, and snippets.

@ducnh1022
Created October 4, 2024 05:17
Show Gist options
  • Save ducnh1022/b619aa0a1229c4224b0b0db781717f19 to your computer and use it in GitHub Desktop.
Save ducnh1022/b619aa0a1229c4224b0b0db781717f19 to your computer and use it in GitHub Desktop.
Here’s the previous solution revisited to make sure we can dynamically split a DBT model's execution into monthly batches using today’s date as the input.
### Scenario Recap:
- **Input today’s date**: Start the script from the current month.
- **Run for the next 12 months**: Split into 12 monthly batches.
- **Pass year and month to both the main model and the referenced model**.
- **Automatically handle date calculations**.
### Full Example Implementation:
#### **Parent Model (`my_batch_model.sql`)**:
```sql
{% set today = modules.datetime.date.today() %}
{% set start_date = today.replace(day=1) %}
-- Loop through the next 12 months, starting from the current month
{% for i in range(0, 12) %}
{% set current_month = (start_date + dateutil.relativedelta.relativedelta(months=i)).month %}
{% set current_year = (start_date + dateutil.relativedelta.relativedelta(months=i)).year %}
-- Call the referenced model with both year and month parameters
with base as (
select *
from {{ ref('my_ref_model', current_year, "{:02d}".format(current_month)) }} -- Pass year and month
)
-- Filter by current year and month in the main model
select *
from base
where extract(year from date_column) = {{ current_year }}
and extract(month from date_column) = {{ current_month }}
{% if not loop.last %}
union all
{% endif %}
{% endfor %}
```
### Breakdown:
1. **Input Date**:
- `today = modules.datetime.date.today()` captures the current date, e.g., `2024-10-04`.
- `start_date = today.replace(day=1)` ensures that the loop starts from the first day of the current month (`2024-10-01`).
2. **Loop for 12 Months**:
- The loop iterates 12 times, each time incrementing the month and potentially the year.
- `dateutil.relativedelta.relativedelta(months=i)` is used to calculate the next months and years dynamically.
3. **Passing Parameters to the Referenced Model**:
- In each iteration, the `current_year` and `current_month` (formatted as `{:02d}`) are passed as parameters to the referenced model.
4. **Union**:
- The results for each month are unioned together using `UNION ALL`.
---
#### **Referenced Model (`my_ref_model.sql`)**:
```sql
-- Fetch the year and month parameters passed from the parent model
{% set year = var('year', 2023) %} -- Default to year 2023 if not provided
{% set month = var('month', '01') %} -- Default to January if not provided
-- Select data for the given year and month
select *
from my_source_table
where extract(year from date_column) = {{ year }}
and extract(month from date_column) = {{ month }}
```
### Key Features:
1. **Dynamic Date Handling**:
- The parent model loops through the months starting from today and automatically calculates the next 12 months.
2. **Passing Parameters**:
- The `year` and `month` are passed dynamically to the referenced model for each iteration. This ensures the correct data is selected for each month and year combination.
3. **Date Filtering**:
- The referenced model filters the data by the `year` and `month` passed from the parent model.
---
### Example Execution:
- If today is `2024-10-04`, this DBT model will run for:
- October 2024 (`2024-10`)
- November 2024 (`2024-11`)
- December 2024 (`2024-12`)
- January 2025 (`2025-01`)
- … up to September 2025 (`2025-09`).
Would this solution meet your needs for dynamically batching model runs based on today’s date?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment