Good Day,
Thank you for applying for the position of Data Engineer at Glints. The following describes the Technical Assessment requirements for this position.
A key part of a Data Engineer’s responsibilities is maintaining the serviceability of a Data Warehouse. To achieve this, you will need to understand the setup and operation of a basic Data Warehouse.
In this technical assessment, you are required to submit the setup for the data pipeline of a basic data warehouse using Docker and Apache Airflow.
Your final setup should include the following:
- A source Postgres database (Database X)
- A target Postgres database (Database Y), running in a separate Docker container from Database X
- Apache Airflow, with the webserver accessible at localhost:5884
- A Directed Acyclic Graph (DAG) for transferring the content of Source Database X to Target Database Y
- README.md detailing the usage of your submission
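The setup above could be sketched in a single Compose file along the following lines. This is only an illustrative starting point: the service names, credentials, image tags, and host port for Database Y are assumptions, not part of the assessment specification. Note the 5884:8080 mapping, since Airflow's webserver listens on 8080 inside the container.

```yaml
services:
  # Source database (Database X) -- hypothetical credentials
  postgres-x:
    image: postgres:15
    environment:
      POSTGRES_USER: source_user
      POSTGRES_PASSWORD: source_pass
      POSTGRES_DB: source_db
    ports:
      - "5432:5432"

  # Target database (Database Y), a separate container from Database X
  postgres-y:
    image: postgres:15
    environment:
      POSTGRES_USER: target_user
      POSTGRES_PASSWORD: target_pass
      POSTGRES_DB: target_db
    ports:
      - "5433:5432"

  # Airflow webserver, exposed at localhost:5884 as required
  airflow:
    image: apache/airflow:2.7.1
    command: standalone
    ports:
      - "5884:8080"
    volumes:
      - ./dags:/opt/airflow/dags
```

A real submission would also need Airflow metadata-database configuration and an initialization step; `airflow standalone` is used here only to keep the sketch short.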
As the focus of this technical assessment is on setting up a basic data pipeline with Airflow, the content of the table to be transferred from Source Postgres Database X is up to you. It can be as basic as:
| id | creation_date | sale_value |
|----|---------------|------------|
| 0  | 12-12-21      | 1000       |
| 1  | 13-12-21      | 2000       |
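The sample table above could be created and seeded in Database X with a short SQL script such as the one below. The table name `sales` is a hypothetical choice (the assessment leaves it to you), and the DD-MM-YY values are written as full ISO dates for Postgres's `DATE` type.

```sql
-- Hypothetical seed script for Source Database X
CREATE TABLE IF NOT EXISTS sales (
    id            INTEGER PRIMARY KEY,
    creation_date DATE,
    sale_value    NUMERIC
);

INSERT INTO sales (id, creation_date, sale_value) VALUES
    (0, '2021-12-12', 1000),
    (1, '2021-12-13', 2000);
```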
Your submission should be a public Git repository containing, at a minimum:
- docker-compose.yml
- README.md explaining your setup, with instructions for running it and the credentials for inspecting the final Target Database Y
- Any scripts required to run the setup in a Linux environment
Any Docker images required for your setup should be stored in a public Docker repository. Your setup will be assessed in a Linux environment: the DAG will be triggered manually via the Airflow web interface at localhost:5884, and the content of Target Database Y will then be inspected.
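The core of the DAG is the copy step itself. One possible shape for it is sketched below against plain DB-API connections; in a real DAG this function would receive connections from Airflow's `PostgresHook`, but the logic is identical. The table name `sales` and its columns are assumptions matching the sample table, and sqlite3 is used here only as a stand-in for the two Postgres databases so the sketch runs without any services.

```python
# Sketch of the DAG's transfer step, assuming a table named "sales".
# sqlite3 stands in for the two Postgres databases; with psycopg2 the
# only change needed is the "%s" placeholder style instead of "?".
import sqlite3


def transfer_table(src_conn, dst_conn, table="sales"):
    """Copy every row of `table` from the source DB to the target DB."""
    src_cur = src_conn.cursor()
    src_cur.execute(f"SELECT id, creation_date, sale_value FROM {table}")
    rows = src_cur.fetchall()

    dst_cur = dst_conn.cursor()
    dst_cur.executemany(
        f"INSERT INTO {table} (id, creation_date, sale_value) VALUES (?, ?, ?)",
        rows,
    )
    dst_conn.commit()
    return len(rows)


# Demo with in-memory databases: X holds the sample rows, Y starts empty.
db_x = sqlite3.connect(":memory:")  # stands in for Source Database X
db_y = sqlite3.connect(":memory:")  # stands in for Target Database Y
ddl = "CREATE TABLE sales (id INTEGER, creation_date TEXT, sale_value INTEGER)"
db_x.execute(ddl)
db_y.execute(ddl)
db_x.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(0, "2021-12-12", 1000), (1, "2021-12-13", 2000)],
)

copied = transfer_table(db_x, db_y)
print(copied)  # prints 2
```

In the actual submission this function would be wrapped in a `PythonOperator` (or replaced by a `GenericTransfer`-style operator) inside the DAG file mounted into the Airflow container.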
Do your best to keep the time you spend on this Assessment to 2 days (1 weekend). When you are done (or should you have any queries), email your repository URL to [email protected].