- Prerequisites
- Create the core node
- Create the worker nodes
- Bring up Kubernetes cluster
- Install Airflow with Helm
- You'll need a reasonably high-end Windows PC for this, say 16 cores / 32 vCPUs (threads) with >=64GB RAM.
- Ideally, a router that lets you assign DHCP reservations to specific MAC addresses.
- Install Hyper-V
- Add an external virtual switch (if you have a second Ethernet card, I'd recommend assigning the switch to that one)
- I use a secondary (D:) drive for my virtual machines; I'll call that path %vroot% (mine is D:\virtualmachine)
- I'll be using Ubuntu 24.04 Server as the VM distribution. Download the image from https://ubuntu.com/download/server
- A way of SSHing into your VMs. I use WSL.
- Anything below wrapped in %percentsigns% should be substituted with your own values.
- You should probably also have read some of the documentation!
The steps for creating the core and worker nodes are very similar, so I'll go through creating the core node in full and then, for the workers, just list what to do differently. I find it very useful to keep this core node running at all times for things like testing Postgres and Docker, even when I'm not using Airflow.
- Open Hyper-V
- Create new VM in %vroot% with:
- Your chosen %machinename%
- Your chosen %vroot% location
- A 1024GB dynamically expanding hard drive
- Generation 2
- 4096MB memory
- External Switch
- Virtual hard disk in %vroot%\%machinename%
- Install operating system later
- Open settings for your new machine:
- Under Processor, change the number of virtual processors to 4
- Under Security, change Secure Boot to 'Microsoft UEFI Certificate Authority'
- Under SCSI Controller, create a DVD drive, then mount the Ubuntu ISO
- Click Apply (bottom right)
- Under Firmware, move the DVD drive to the top of the boot order
- Under Checkpoints, disable automatic checkpoints (I disable checkpoints entirely)
- Click Ok
- Connect to and start the VM
- Install Ubuntu using the defaults, except for:
- uncheck LVM in the storage options (use the spacebar to toggle the checkbox)
- your name, username, server name and password
- Install openssh server
- Install microk8s
- Reboot
- Log in using your username/password
sudo su
apt install nano net-tools
ifconfig
- Take a note of the %nodeip% address for `eth0`
- From WSL, copy your public SSH key into the machine (if ~/.ssh doesn't exist on the VM yet, create it there first with mkdir -p ~/.ssh):
scp ~/.ssh/id_rsa.pub %username%@%nodeip%:~/.ssh/authorized_keys
- Now is also a good time to reserve an IP address for this machine on your router
- Back in the VM window, do:
chmod 0644 ~/.ssh/authorized_keys
- In WSL, SSH into the VM with
ssh %username%@%nodeip%
sudo su
- You can close the VM console window now; from here on we'll use SSH so we can copy/paste.
- Disable password for sudo (not recommended but meh)
sudo nano /etc/sudoers
- Change the ~third-to-last line (starting `%sudo`) to `%sudo ALL=(ALL:ALL) NOPASSWD: ALL`
- Save and exit nano
- Disable swap
swapoff -a
nano /etc/fstab
- comment out the last line (starting `/swap.img`), then save and exit nano
rm /swap.img
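- (Optional, not part of the original steps) Confirm swap is fully off before moving on - `swapon --show` should print nothing, and the Swap line of `free -h` should read 0B:
swapon --show
free -h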
- Install utils
apt update
apt upgrade -y
apt install -y python3-pip python3-virtualenv gnupg2 wget
- Install Postgres
sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg
apt update
apt install -y postgresql-16 postgresql-contrib-16
sudo -u postgres psql postgres
create role airflow with superuser login password '%airflowdbpassword%';
create database airflow with owner airflow;
\q
nano /etc/postgresql/16/main/pg_hba.conf
- Add this line near the bottom
host all all 0.0.0.0/0 scram-sha-256
- THIS IS INSECURE. DO NOT DO THIS IN PRODUCTION
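- If you want something slightly less wide-open, a rule scoped to the airflow database/user and your LAN would look like the line below (192.168.0.0/24 is an assumption - adjust to your own subnet):
host airflow airflow 192.168.0.0/24 scram-sha-256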
nano /etc/postgresql/16/main/postgresql.conf
- uncomment `listen_addresses = '*'`
service postgresql restart
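- (Optional) To check the remote-access changes took effect, try connecting from WSL or another machine on the network (this assumes a psql client is installed there; it will prompt for %airflowdbpassword%):
psql "host=%nodeip% port=5432 dbname=airflow user=airflow"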
- (Optional) Install samba
- This bit is also not secure but gives you an easy way of transferring files to the VM
apt install samba
mkdir /home/share/
chown %username%:%username% /home/share/
chmod 0777 /home/share/
nano /etc/samba/smb.conf
- Add this near the bottom
[public]
   comment = Share Directories
   browseable = yes
   writable = yes
   public = yes
   create mask = 0775
   directory mask = 0775
   path = /home/share
   guest ok = yes
systemctl restart smbd
- You can now mount this share by going to "This PC" in Windows Explorer and clicking "Map network drive"
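- If you'd rather check the share from WSL first (assuming the smbclient package is installed there), listing the shares anonymously should show `public`:
smbclient -L //%nodeip% -N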
- Allow the current user to use microk8s without sudo
sudo usermod -a -G microk8s $USER
mkdir -p ~/.kube
chmod 0700 ~/.kube
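- The group change only applies to new login sessions; to pick it up immediately and confirm microk8s works without sudo, you can (optionally) run:
newgrp microk8s
microk8s status --wait-ready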
- Tidy up
apt autoclean
apt autoremove
fstrim /
poweroff
- Now that we've turned off the VM, we can reduce the size of the virtual hard drive by doing:
- Open settings for the VM
- Under SCSI Controller
- Select the hard drive and click edit
- Click through the wizard selecting the option to compact the drive
- Whilst we're here, also under SCSI Controller, remove the DVD drive
Creating worker nodes is much the same as creating the core node, but with a few steps missing. If you want more than one worker node, don't worry: we can turn the first one into a template, then copy it for as many workers as we need. I suggest using a %machinename% of worker1, with %username% and %password% = worker.
So, repeat the above steps again but change:
- (2) Create VM
- Use 1024MB memory (the default)
- (3) VM Settings
- Use 2 CPUs
- (5) Install Ubuntu
- %name%, %username%, %password% = worker
- %machinename% = worker1
- Select the "minimise" option when offered
- (12) Skip the last line (starting `apt install`)
- (13) Skip this step (postgres)
- (14) Skip this step (samba)
Additional step: install containerd.
wget https://github.com/containerd/containerd/releases/download/v1.7.16/containerd-1.7.16-linux-amd64.tar.gz
sudo tar Cxzvf /usr/local containerd-1.7.16-linux-amd64.tar.gz
sudo mkdir /etc/containerd
containerd config default > config.toml
sudo cp config.toml /etc/containerd
wget https://raw.githubusercontent.com/containerd/containerd/main/containerd.service
sudo cp containerd.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now containerd
systemctl status containerd
poweroff
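- (Optional, not in the original steps) Before running the poweroff above, you can sanity-check the new runtime with the ctr client that ships in the same containerd tarball:
sudo ctr version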
To create additional workers:
- Make sure you have shut down the worker1 VM
- Go into %vroot% and make a copy of the worker folder, renaming it to something like "worker_template"
- In the Hyper-V console:
- Click "Import Virtual Machine" to start the wizard
- Navigate to the copy of the worker (%vroot%\worker_template)
- In "Choose Import Type" select "make a copy of the virtual machine (create a new unique ID)"
- In "Choose Destination" check "Store the VM in a different location"
- Change all paths to %vroot%\workerX (where X is the number of the worker you're creating)
- Do the same in "Choose Storage Folders"
At this point, to make life easy, I would do the following:
- Copy my private SSH key to the core node
scp ~/.ssh/id_rsa %username%@%corenodeip%:~/.ssh/
- ssh in to the core node
chmod 0600 .ssh/id_rsa
- Bring up a tmux session
sudo apt install tmux
tmux
- Create a tmux window (Ctrl-b, then c) for each of the workers
- For each of those consoles, ssh into the relevant worker
- I can then switch between workers with Ctrl-b, then n
Also, on the core node, set up the following aliases so we don't have to keep typing microk8s:
alias kubectl='microk8s kubectl'
alias helm='microk8s helm'
Add these to ~/.bash_aliases so they persist across logins.
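A minimal way to do that (assuming the default Ubuntu ~/.bashrc, which sources ~/.bash_aliases if it exists):
cat >> ~/.bash_aliases <<'EOF'
alias kubectl='microk8s kubectl'
alias helm='microk8s helm'
EOF
source ~/.bash_aliases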
- On the core node, check microk8s is running:
microk8s status --wait-ready
kubectl get nodes
- Let's add a node to the Kubernetes cluster!
- On the core node, run
microk8s add-node
- Copy the connection string ending in `--worker` - it should look something like:
microk8s join 192.168.0.11:25000/018fc2062deaa9409be53babbc29c9d5/26f06c9315a2 --worker
- Switch to the next node to add then paste and run that command
- It might take a minute or so, but hopefully it should say
- "Successfully joined the cluster"
- repeat this step for all of your nodes
- Back on the core node, check the cluster is up and running
kubectl get nodes
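- If a node is stuck in NotReady, the wide output (internal IPs, runtime versions) and a describe of the node usually show why - worker1 here is just the example name from earlier:
kubectl get nodes -o wide
kubectl describe node worker1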
- Get the Airflow helm chart
mkdir /home/share/airflow
cd /home/share/airflow
helm repo add apache-airflow https://airflow.apache.org
helm repo update
- The Airflow helm chart should now be cached in ~/.cache/helm/repository/
- There are a few things we need to override in the default airflow helm chart config
- create a new config file:
nano overrides.yaml
- add the following:
postgresql:
  enabled: false
data:
  metadataConnection:
    user: airflow
    pass: %airflowdbpassword%
    protocol: postgresql
    host: %corenodeip%
    port: 5432
    db: airflow
    sslmode: disable
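- If you want to see every value the chart accepts before overriding anything, you can dump the chart's defaults to a file (default-values.yaml is just an arbitrary name):
helm show values apache-airflow/airflow > default-values.yaml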
- Install airflow on the cluster
helm install airflow apache-airflow/airflow --namespace airflow --create-namespace --debug -f overrides.yaml
- If you get a problem with timeouts, check that your database is running and you can connect to it as the airflow user
- If you can connect to the db from the core node, but not from a worker node or from WSL, then you've missed the pg_hba step.
- Also check that you've given the correct ip address and db password in the overrides.yaml.
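- Another way to see what the install is doing (or where it's stuck) is to look at the pods the chart creates - pod names will differ on your cluster, so substitute %podname% from the first command:
kubectl get pods --namespace airflow
kubectl describe pod %podname% --namespace airflow
kubectl logs %podname% --namespace airflow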
- If anything goes wrong, uninstall with
helm uninstall airflow --namespace airflow
kubectl delete namespace airflow
- and try again.