Eduardo Bizarro edbizarro

#!/usr/bin/env python3
'''Script to autogenerate dbt commands for changed models against a chosen git branch,
with support for fully refreshing models with specific tags.
Usage:
$ python3 dbt_run_changed.py --target_branch master --target dev --commands [run, test] --full_refresh_tags [full_refresh]
Assume model1 and model2 are changed models and model2 is tagged with "full_refresh". The script will generate three dbt commands:
1. dbt run --target dev --model model2 --full-refresh
eduardorost / merge-schemas.scala
Last active December 19, 2023 09:36
Merge Schema with structs
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._
import org.slf4j.{Logger, LoggerFactory}
object Main {
val logger: Logger = LoggerFactory.getLogger(this.getClass)
private lazy val sparkConf: SparkConf = new SparkConf()
.setMaster("local[*]")
jtalmi / dbt_linter.py
Last active March 15, 2022 20:41
dbt linter -- check for unique/not_null tests and description/columns
#!/usr/bin/env python3
"""
CI script to check:
1. Models have both a unique and not_null test.
2. Models have a description and columns (i.e. a schema.yml entry)
"""
import json
import logging
import os
import subprocess
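
The preview above is cut off before the checking logic, so the following is only a minimal sketch of the two checks the docstring names, not the gist's actual implementation. It assumes each model is available as a schema.yml-style dict (`description`, `columns`, and per-column `tests` given as strings or single-key dicts); the names `lint_model`, `column_test_names`, and `REQUIRED_TESTS` are hypothetical, and the sketch treats a test found on any column as satisfying the requirement.

```python
from typing import Any

# Assumption: these are the two generic tests every model must carry.
REQUIRED_TESTS = {"unique", "not_null"}


def column_test_names(column: dict[str, Any]) -> set[str]:
    """Collect test names from one column entry.

    A test may be a plain string ("unique") or a single-key dict
    ({"relationships": {...}}); in the dict case the key is the name.
    """
    names: set[str] = set()
    for test in column.get("tests", []):
        names.add(test if isinstance(test, str) else next(iter(test)))
    return names


def lint_model(model: dict[str, Any]) -> list[str]:
    """Return human-readable problems for one schema.yml-style model entry."""
    problems: list[str] = []
    if not model.get("description"):
        problems.append("missing description")

    columns = model.get("columns", [])
    if not columns:
        problems.append("missing columns")

    # Gather tests across all columns, then report any required test that is absent.
    found: set[str] = set()
    for column in columns:
        found |= column_test_names(column)
    for required in sorted(REQUIRED_TESTS - found):
        problems.append(f"missing {required} test")
    return problems
```

A CI wrapper could call `lint_model` for every model entry and exit non-zero if any returned list is non-empty; how the real script obtains the model entries (the `json` and `subprocess` imports suggest it shells out to dbt) is not visible in the preview.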