pip install 'apache-airflow[celery]'
sudo apt update
sudo apt install redis-server
- Edit the following file
sudo vi /etc/redis/redis.conf
- Change the following
"supervised no"
- to
"supervised systemd"
- Save and close the file
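- Alternatively, the same edit can be made non-interactively; this sketch assumes the file still contains the default "supervised no" line
sudo sed -i 's/^supervised no/supervised systemd/' /etc/redis/redis.conf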
-
- Restart Redis
sudo systemctl restart redis.service
- Now check the status of Redis
sudo systemctl status redis.service
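- You can also confirm Redis is responding; redis-cli ships with the redis-server package and should reply with PONG
redis-cli ping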
-
Configure airflow.cfg
-
Look for the executor setting and change it to CeleryExecutor
"executor = CeleryExecutor"
- Change the following in airflow.cfg as well; don't forget to set up PostgreSQL first (covered below)
"sql_alchemy_conn = postgresql+psycopg2://airflow_user:airflow_pass@localhost/airflow_db"
- Celery parameters to change in the airflow.cfg file
- Airflow pushes task messages to the Redis server, so the line below is the connection to the Redis instance; the trailing 0 is the Redis database number
"broker_url = redis://localhost:6379/0"
- Also change the following parameter, which is where the result metadata is stored every time a task is executed
"result_backend = db+postgresql://airflow_user:airflow_pass@localhost/airflow_db"
-
Once this is done, save and close the file
-
Now install the Airflow Redis package from the console
pip install 'apache-airflow[redis]'
- Look up the SQL connection
airflow config get-value core sql_alchemy_conn
- Get the executor
airflow config get-value core executor
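- The Celery settings can be checked the same way; for example, the broker connection
airflow config get-value celery broker_url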
- Update the package index
sudo apt update
sudo apt install postgresql
- Hit Enter, then answer yes when prompted
sudo -u postgres psql
ALTER USER postgres PASSWORD 'postgres';
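- The connection string above uses airflow_user and airflow_db, which don't exist by default; assuming you are keeping the names from the airflow.cfg examples, create them while still inside psql
CREATE DATABASE airflow_db;
CREATE USER airflow_user WITH PASSWORD 'airflow_pass';
GRANT ALL PRIVILEGES ON DATABASE airflow_db TO airflow_user;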
- exit
\q
- Install the Airflow postgresql package
pip install 'apache-airflow[postgres]'
- Configure Airflow
- Open airflow.cfg and look for
"sql_alchemy_conn"
- Change it to
"sql_alchemy_conn = postgresql+psycopg2://airflow_user:airflow_pass@localhost/airflow_db"
- Let's check if we can reach the database by using
airflow db check
- It should say something like this
[2021-10-17 08:20:05,996] {db.py:783} INFO - Connection successful.
-
Which means we have successfully configured it
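- If the metadata tables haven't been created in the new database yet, initialize them as well
airflow db init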
-
Double-check that the executor in airflow.cfg is set to
"executor = CeleryExecutor"
-
Then stop and restart Airflow
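- With CeleryExecutor at least one worker must be running alongside the webserver and scheduler; assuming Airflow 2.x, a typical startup looks like this (run each in its own terminal, or add -D to daemonize)
airflow webserver
airflow scheduler
airflow celery worker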
-
A configuration that could work, depending on your resources; all three settings live in the [core] section of airflow.cfg
- The maximum number of task instances that can run at once across the whole Airflow installation
"parallelism = 32"
- The maximum number of task instances allowed to run concurrently within a single DAG
"dag_concurrency = 16"
- The maximum number of active runs per DAG
"max_active_runs_per_dag = 16"