Skip to content

Instantly share code, notes, and snippets.

View jthandy's full-sized avatar

Tristan Handy jthandy

View GitHub Profile
with
gains as (
select date_trunc('month', signup_date) as month, count(*) total_gains
from {{env.schema}}.mailchimp_members
group by date_trunc('month', signup_date)
),
hard_bounces as
(
-- get the first hard bounce date for each email
with email_summary as (
select *
from {{env.schema}}.mailchimp_email_summary
), best_day_of_week as (
select date_part(dow, sent_date), avg(opened::float)
from email_summary
group by 1
with email_summary as (
select *
from {{env.schema}}.mailchimp_email_summary
)
select email_id, sum(clicked)::float / sum(opened)
from email_summary
group by 1
select
md5(
coalesce(date_day::varchar, '') ||
coalesce(destination_url, '') ||
coalesce(utm_medium, '') ||
coalesce(utm_source, '') ||
coalesce(utm_campaign, '') ||
coalesce(utm_term, '') ||
coalesce(utm_content, '') ||
coalesce(ad_group_id::varchar, '') ||
select
ad_performance.*,
sessions.*
from ad_performance
left outer join sessions on ad_performance.id = sessions.ad_performance_id
with discounts as (
select * from {{ref('stripe_discounts')}}
),
invoice_items as (
select * from {{ref('stripe_invoice_items')}}
@jthandy
jthandy / vacuum-analyze-sinter.md
Created August 13, 2018 19:51
Running vacuum and analyze on Redshift via Sinter

Running vacuum and analyze in Sinter

dbt and Sinter have the ability to run regular Redshift maintenance jobs. It's great to set these up early on in a project so that things stay clean as the project grows, and implementing these jobs in Sinter allows the same easy transparency and notifications as with your other dbt jobs.

This document will go through the specific steps necessary to configure vacuum and analyze jobs in the current version of dbt and Sinter. In the future, there will likely be a more idiomatically consistent way to express this logic using native dbt operations. Currently, this does work even if it is not elegant.

Step 1: Create the macros

macros/redshift_maintenance.sql

{% macro source_select_sql(relation, column_name) %}
select {{column_name}} as f from {{relation}}
{% endmacro %}
{% macro single_pass_stats(relation, column_name) %}
{%- call statement('single_pass_column_stats', fetch_result=True) -%}
with source as ( {{ fishtown_internal_analytics.source_select_sql(relation, column_name) }} ),
single_pass_stats as (
@jthandy
jthandy / redshift_flatten_test_data.sql
Created May 14, 2019 00:43
a test dataset for some work with Redshift flattening.
create table dbt_jthandy.flatten_test (
order_id int,
json_text varchar(1000)
)
;
insert into dbt_jthandy.flatten_test
(order_id, json_text)
values
(1, '{
order_id items
1 [{"id":"fa4b6cd3-4719-4b97-848b-7f2025f5e693","quantity":1,"sku":"M900353-SWB-RYL-2","list_price":60.0},{"id":"c39f9474-a278-4162-9cfa-aa068f4e1665","quantity":1,"sku":"F033199-SWB-FWL-1","list_price":20.0}]
2 [{"id":"fa4b6cd3-4719-4b97-848b-7f2025f5e693","quantity":1,"sku":"M900353-SWB-RYL-2","list_price":60.0}]