@mlabouardy
Created April 16, 2019 18:44
Load and transform CSV to BigQuery
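#!/bin/bash
# Copy the newest CSV export of each table from S3 to GCS, then load it into BigQuery.
# Expects S3_BUCKET, GS_BUCKET and GCP_AUTH_BUCKET to be set in the environment.

echo "Download BigQuery Credentials"
aws s3 cp "s3://$GCP_AUTH_BUCKET/auth.json" .

echo "Upload CSV to GCS"
mkdir -p csv
# -f: do not fail on the first run, when the tables file does not exist yet
rm -f tables

# Top-level prefixes are listed as "PRE <name>/", so the second field is the table name
for raw in $(aws s3 ls "s3://$S3_BUCKET/" | awk '{print $2}'); do
    table=${raw%/}
    # Skip empty entries and prefixes whose name starts with "df"
    if [[ $table != "" && $table != df* ]]; then
        echo "Table: $table"
        # Fourth field is the object name; the reverse sort puts the last name first
        csv=$(aws s3 ls "s3://$S3_BUCKET/$table/" | awk '{print $4}' | grep ^ | sort -r | head -n1)
        echo "$table" >> tables
        echo "CSV: $csv"
        echo "Copy csv from S3"
        aws s3 cp "s3://$S3_BUCKET/$table/$csv" "csv/$table.csv"
        echo "Upload csv to GCS"
        gsutil cp "csv/$table.csv" "gs://$GS_BUCKET/$table.csv"
    fi
done

echo "Import CSV to BigQuery"
python app.py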
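
The two listing-parsing steps in the loop (finding table prefixes, then picking the newest CSV) can be pulled out into small functions and tried against sample text without touching S3. This is a minimal sketch, not part of the original gist: `list_tables` and `latest_csv` are hypothetical helper names, and it assumes that object names contain no spaces and sort chronologically (for example, because they embed a date), which is the same assumption the `sort -r | head -n1` step relies on.

```shell
#!/bin/bash

# Hypothetical helper: read an `aws s3 ls s3://bucket/` listing on stdin and
# print one table name per line, dropping the trailing slash and any "df*" prefix.
list_tables() {
    awk '$1 == "PRE" {print $2}' | sed 's#/$##' | grep -v '^df'
}

# Hypothetical helper: read an `aws s3 ls s3://bucket/table/` listing on stdin
# and print the object name that sorts last (assumed to be the newest export).
latest_csv() {
    awk '{print $4}' | grep -v '^$' | sort -r | head -n1
}
```

In the script, the loop could then read `for table in $(aws s3 ls "s3://$S3_BUCKET/" | list_tables)` and `csv=$(aws s3 ls "s3://$S3_BUCKET/$table/" | latest_csv)`.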