Geremie / namespace.yaml
Created October 30, 2020 02:26
Automate your Cloud SQL data synchronization to BigQuery with Airflow
apiVersion: v1
kind: Namespace
metadata:
  name: cloud-sql-to-bq
  labels:
    name: cloud-sql-to-bq
Geremie / pod.yaml
Last active November 1, 2020 16:32
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: cloud-sql-proxy
  name: cloud-sql-proxy
  namespace: cloud-sql-to-bq
spec:
  replicas: 1
  selector:
Geremie / service.yaml
Created October 30, 2020 03:15
kind: Service
apiVersion: v1
metadata:
  labels:
    run: cloud-sql-proxy-service
  name: cloud-sql-proxy-service
  namespace: cloud-sql-to-bq
spec:
  ports:
  - port: 3306
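Inside the cluster, a Service like this one is normally reachable through the standard Kubernetes DNS name `<service>.<namespace>.svc.<cluster domain>`. The following is a minimal Python sketch of how the host for a MySQL connection to the proxy could be derived from the manifest values. The helper name `service_host` is hypothetical, and `cluster.local` is only the common default cluster domain, so treat it as an assumption.

```python
def service_host(service, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name of a Kubernetes Service (hypothetical helper)."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Values copied from the Service manifest above.
host = service_host("cloud-sql-proxy-service", "cloud-sql-to-bq")
port = 3306

if __name__ == "__main__":
    print(f"{host}:{port}")
    # prints cloud-sql-proxy-service.cloud-sql-to-bq.svc.cluster.local:3306
```

Building the name from the service and namespace keeps it in step with the manifests if either one is renamed.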
Geremie / cloud_sql_to_bq.py
Last active October 31, 2020 01:41
import os
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.bigquery_operator import BigQueryOperator
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator
from airflow.contrib.operators.mysql_to_gcs import MySqlToGoogleCloudStorageOperator

CLOUD_SQL_INSTANCE = 'mysql-instance-prod-v1'
Geremie / configure-cloud-shell.sh
Created October 31, 2020 16:07
gcloud config set project <project_id>
Geremie / enable_composer.sh
Created October 31, 2020 16:21
gcloud services enable composer.googleapis.com
Geremie / service_account_cloud_composer.sh
Created October 31, 2020 17:23
gcloud iam service-accounts create cloud-composer \
  --display-name='Cloud Composer service account'
gcloud projects add-iam-policy-binding <project_id> \
  --member=serviceAccount:cloud-composer@<project_id>.iam.gserviceaccount.com \
  --role=roles/composer.worker
gcloud projects add-iam-policy-binding <project_id> \
  --member=serviceAccount:cloud-composer@<project_id>.iam.gserviceaccount.com \
  --role=roles/cloudsql.client
gcloud projects add-iam-policy-binding <project_id> \
  --member=serviceAccount:cloud-composer@<project_id>.iam.gserviceaccount.com \
  --role=roles/bigquery.user
gcloud projects add-iam-policy-binding <project_id> \
  --member=serviceAccount:cloud-composer@<project_id>.iam.gserviceaccount.com \
  --role=roles/bigquery.dataEditor
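The four bindings above differ only in the role, so they can be generated from a list instead of being copied by hand. The following is a minimal Python sketch under stated assumptions: the helper `binding_commands` is hypothetical, `my-project` is a placeholder project ID that does not come from the original gist, and the code only builds the command strings without invoking gcloud.

```python
import shlex

# Roles taken from the four commands above.
ROLES = [
    "roles/composer.worker",
    "roles/cloudsql.client",
    "roles/bigquery.user",
    "roles/bigquery.dataEditor",
]


def binding_commands(project_id, account="cloud-composer"):
    """Return one add-iam-policy-binding command per role (hypothetical helper)."""
    member = f"serviceAccount:{account}@{project_id}.iam.gserviceaccount.com"
    return [
        shlex.join([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}", f"--role={role}",
        ])
        for role in ROLES
    ]


if __name__ == "__main__":
    # "my-project" is a placeholder, not a value from the original gist.
    for command in binding_commands("my-project"):
        print(command)
```

Printing the commands rather than running them leaves room to review them before granting any role. `shlex.join` requires Python 3.8 or later.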
Geremie / create_composer.sh
Created October 31, 2020 17:25
gcloud composer environments create data-synchronization-env \
  --location=europe-west1 \
  --zone=europe-west1-b \
  --service-account=cloud-composer@<project_id>.iam.gserviceaccount.com \
  --python-version=3 \
  --enable-ip-alias \
  --enable-private-environment \
  --image-version=composer-1.12.4-airflow-1.10.10
Geremie / enable_service_networking.sh
Last active November 1, 2020 15:33
gcloud services enable servicenetworking.googleapis.com
Geremie / create_address_range_and_peering_connection.sh
Created October 31, 2020 17:31
gcloud compute addresses create google-managed-services-default \
  --global \
  --purpose=VPC_PEERING \
  --prefix-length=16 \
  --network=default \
  --project=<project_id>
gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --ranges=google-managed-services-default \
  --network=default \
  --project=<project_id>