-- Read about DataTalks.Club Data Engineering Zoomcamp --
The second week of the Data Engineering Zoomcamp by DataTalks.Club introduced a new tool, one of the most popular data pipeline platforms: Apache Airflow. So we are going to create some workflows!
First, you have to run the Docker Compose Airflow installation in an environment of your choice, which can be (but is not limited to) macOS, Linux, a GCP VM, or the very popular WSL.
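As a rough sketch of that setup, the official Airflow quick-start flow looks roughly like the following (the Airflow version in the URL is an assumption; pick the one matching your install, and note that the `AIRFLOW_UID` step applies to Linux/WSL hosts):

```shell
# Fetch the official docker-compose.yaml for a given Airflow version
# (version 2.2.3 here is an assumption -- adjust to your setup)
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.2.3/docker-compose.yaml'

# Create the folders Airflow mounts into its containers
mkdir -p ./dags ./logs ./plugins

# On Linux/WSL, set the host user id so files created in mounted
# volumes are owned by you rather than root
echo -e "AIRFLOW_UID=$(id -u)" > .env

# Initialize the metadata database and create the default user,
# then bring the whole stack up in the background
docker compose up airflow-init
docker compose up -d
```

Once the containers are healthy, the webserver is reachable at http://localhost:8080 with the default credentials from the compose file.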
What's more, we also need the Google Cloud SDK installed in our Airflow environment in order to connect to the Cloud Storage bucket and create tables in BigQuery.
That means we cannot just use the official docker-compose.yaml referenced in Airflow's docs; instead, we have to build a custom Dockerfile that extends the apache/airflow image with our additional dependencies. Then we can incorporate it into docker-compose.yaml.
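A minimal sketch of such a Dockerfile, assuming Airflow 2.2.3 and a particular Cloud SDK version (both version numbers are assumptions you should adjust):

```dockerfile
FROM apache/airflow:2.2.3

# System packages are installed as root
USER root
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Download and install the Google Cloud SDK
# (SDK version and install path are assumptions)
ARG CLOUD_SDK_VERSION=322.0.0
ENV GCLOUD_HOME=/home/google-cloud-sdk
RUN curl -fL \
      "https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-${CLOUD_SDK_VERSION}-linux-x86_64.tar.gz" \
      -o /tmp/google-cloud-sdk.tar.gz \
    && mkdir -p "${GCLOUD_HOME}" \
    && tar xzf /tmp/google-cloud-sdk.tar.gz -C "${GCLOUD_HOME}" --strip-components=1 \
    && "${GCLOUD_HOME}/install.sh" --quiet \
    && rm /tmp/google-cloud-sdk.tar.gz
ENV PATH="${GCLOUD_HOME}/bin:${PATH}"

# Python dependencies are installed as the airflow user
USER airflow
RUN pip install --no-cache-dir apache-airflow-providers-google
```

To wire it into the compose setup, the `image:` line in docker-compose.yaml can be swapped for a `build:` directive so all Airflow services use the extended image:

```yaml
# In docker-compose.yaml, replace
#   image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.2.3}
# with:
  build: .
```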