@takimo
Last active August 24, 2024 01:59
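A GitHub Actions workflow that runs dbt-osmosis refactor against the catalog.json generated by a dbt Cloud job and opens a PR containing the diff for meta and tag information that has not been propagated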
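# .github/workflows/dbt-osmosis-refactor.yml
# This workflow is designed for workflow_dispatch. dbt_catalog_downloader.py fetches, as JSON, the latest result of the dbt Cloud job
# that runs against the main branch and saves it as target/catalog.json. dbt-osmosis then reads that file and generates a diff
# wherever meta and tags have not been propagated, and the diff is opened as a new PR. The workflow is intended to be run periodically.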
name: Run dbt-osmosis and create PR
# workflow_dispatch cannot be used until this workflow has been merged into main
# on:
#   workflow_dispatch:
#     inputs:
#       account_id:
#         description: 'Account ID for dbt Cloud'
#         required: true
#       job_definition_id:
#         description: 'Job Definition ID for dbt Cloud'
#         required: true
on:
  push:
    branches:
      - feature/* # adjust to match your development branch names
jobs:
  run-dbt-osmosis:
    runs-on: ubuntu-latest
    env:
      account_id: '1111' # account ID hard-coded for development / with workflow_dispatch use ${{ github.event.inputs.account_id }}
      job_definition_id: '22222' # job definition ID hard-coded for development / with workflow_dispatch use ${{ github.event.inputs.job_definition_id }}
      dbt_cloud_api_token: ${{ secrets.dbt_cloud_api_token }}
    permissions:
      contents: write
      pull-requests: write
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
      - name: Create dummy profiles.yml
        run: |
          mkdir -p ~/.dbt
          cat <<EOF > ~/.dbt/profiles.yml
          default:
            outputs:
              default:
                type: bigquery
                method: oauth
                project: dummy
                schema: dbt_osmosis_ci
                threads: 4
            target: default
          EOF
      - name: Run dbt_catalog_downloader.py and save catalog.json
        run: |
          mkdir -p target # create the target directory if it does not exist
          python ./scripts/dbt_catalog_downloader.py > target/catalog.json
      - name: Run dbt-osmosis
        run: |
          dbt deps
          dbt-osmosis yaml refactor --skip-add-tags --skip-merge-meta --force-inheritance --catalog-file target/catalog.json
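      - name: Check for changes
        id: check_changes
        run: |
          # Record whether dbt-osmosis modified any tracked files.
          # (The step itself always succeeds, so later steps must test this output, not the step outcome.)
          if git diff --quiet; then
            echo "changed=false" >> "$GITHUB_OUTPUT"
          else
            echo "changed=true" >> "$GITHUB_OUTPUT"
          fi
      - name: Create a new branch and push changes
        if: steps.check_changes.outputs.changed == 'true'
        run: |
          BRANCH_NAME="dbt-osmosis-refactor-$(date +'%Y%m%d%H%M%S')"
          # Shell variables do not survive across steps, so export the name for the next step
          echo "BRANCH_NAME=$BRANCH_NAME" >> "$GITHUB_ENV"
          git checkout -b "$BRANCH_NAME"
          git config --global user.name "github-actions"
          git config --global user.email "actions@github.com"
          git add models
          git commit -m "Run dbt-osmosis refactor"
          git push origin "$BRANCH_NAME"
      - name: Create Pull Request
        if: steps.check_changes.outputs.changed == 'true'
        uses: peter-evans/create-pull-request@v4
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          branch: ${{ env.BRANCH_NAME }}
          base: main
          title: 'dbt-osmosis refactor PR'
          body: 'This PR contains the changes made by dbt-osmosis.'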
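The workflow above redirects whatever the downloader prints into target/catalog.json, so a failed download could hand dbt-osmosis a file that is not a catalog at all. Below is a minimal sketch of a sanity check that could run between the download step and the dbt-osmosis step. The function name `validate_catalog` is hypothetical and not part of the gist, and the check assumes that a dbt catalog.json is a JSON object with top-level `nodes` and `sources` keys.

```python
import json


def validate_catalog(text: str) -> dict:
    """Parse catalog text and check that it looks like a dbt catalog.

    Raises ValueError when the text is not JSON or lacks the expected keys.
    """
    try:
        catalog = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"catalog is not valid JSON: {exc}") from exc
    if not isinstance(catalog, dict):
        raise ValueError("catalog must be a JSON object")
    # Assumption: a dbt catalog.json has top-level "nodes" and "sources" maps.
    missing = [key for key in ("nodes", "sources") if key not in catalog]
    if missing:
        raise ValueError(f"catalog is missing keys: {missing}")
    return catalog
```

One possible use is to read target/catalog.json, pass its contents to `validate_catalog`, and let the step fail on `ValueError` before dbt-osmosis runs.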
# scripts/dbt_catalog_downloader.py
import os
import sys

import requests

# Read the settings from environment variables
account_id = os.environ.get('account_id')
job_definition_id = os.environ.get('job_definition_id')
dbt_cloud_api_token = os.environ.get('dbt_cloud_api_token')

# Build the URL that lists this job's runs, newest first
url = f"https://cloud.getdbt.com/api/v2/accounts/{account_id}/runs/?job_definition_id={job_definition_id}&order_by=-id"
headers = {
    'Authorization': f'Bearer {dbt_cloud_api_token}'
}

# Call the API to fetch the job runs
response = requests.get(url, headers=headers)
response_data = response.json()

# Take the ID of the most recent run (the list may be empty if the job has never run)
if response_data['status']['is_success'] and response_data['data']:
    latest_run_id = response_data['data'][0]['id']
    # Build the URL for downloading catalog.json
    catalog_url = f"https://cloud.getdbt.com/api/v2/accounts/{account_id}/runs/{latest_run_id}/artifacts/catalog.json"
    # Call the API to fetch catalog.json
    catalog_response = requests.get(catalog_url, headers=headers)
    catalog_response.raise_for_status()  # fail instead of writing an error body into catalog.json
    print(catalog_response.text)  # write the JSON string to stdout
else:
    # stdout is redirected to target/catalog.json, so report the error on stderr
    print("Failed to retrieve the latest run ID.", file=sys.stderr)
    sys.exit(1)
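The script above mixes URL construction, HTTP calls, and response handling at module level, which makes the run-selection logic hard to check without calling dbt Cloud. The following sketch separates the pure parts so they can be exercised with a hand-written response. The helper names (`runs_url`, `catalog_url`, `latest_run_id`) are hypothetical; the URLs and the `status.is_success` / `data[0].id` response fields are taken from the script above, not verified independently.

```python
from typing import Any, Mapping

API_BASE = "https://cloud.getdbt.com/api/v2"  # same host as the script above


def runs_url(account_id: str, job_definition_id: str) -> str:
    """URL that lists runs of one job, newest first (order_by=-id)."""
    return (
        f"{API_BASE}/accounts/{account_id}/runs/"
        f"?job_definition_id={job_definition_id}&order_by=-id"
    )


def catalog_url(account_id: str, run_id: int) -> str:
    """URL of the catalog.json artifact for a single run."""
    return f"{API_BASE}/accounts/{account_id}/runs/{run_id}/artifacts/catalog.json"


def latest_run_id(response_data: Mapping[str, Any]) -> int:
    """Return the ID of the first (newest) run in a list-runs response.

    Raises RuntimeError if the API reported a failure or returned no runs.
    """
    status = response_data.get("status", {})
    if not status.get("is_success"):
        raise RuntimeError("dbt Cloud API reported a failure")
    runs = response_data.get("data") or []
    if not runs:
        raise RuntimeError("no runs found for this job")
    return runs[0]["id"]
```

With these helpers, the script body reduces to two `requests.get` calls around `latest_run_id`. Note that the newest run is not necessarily a successful one that generated docs, in which case the catalog.json artifact may not exist for it.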