Talos cluster

01 - Talos Linux - Bootstrapping a Cluster

Using talhelper:

# Create a talconfig.yaml
---
clusterName: talos-cluster
endpoint: https://192.168.0.11:6443
talosVersion: v1.9.2
allowSchedulingOnMasters: true
nodes:
  - machineSpec:
      mode: metal
      arch: amd64
      secureboot: false
    hostname: talos-control-plane-01
    controlPlane: true
    ipAddress: 192.168.0.11
    installDisk: /dev/sda
    nameserver:
      - time.cloudflare.com
    networkInterfaces:
      - interface: eno1
        dhcp: false
        addresses:
        - 192.168.0.11/24
        routes:
          - gateway: 192.168.0.1
#  - machineSpec:
#      mode: metal
#      arch: amd64
#      secureboot: false
#    hostname: talos-worker-01
#    controlPlane: false
#    ipAddress: 192.168.0.12
#    installDisk: /dev/sda
#    nameserver:
#      - time.cloudflare.com
#    networkInterfaces:
#      - interface: eno1
#        dhcp: false
#        addresses:
#        - 192.168.0.12/24
#        routes:
#          - gateway: 192.168.0.1
# Create a file with our future cluster secrets
talhelper gensecret > talsecret.sops.yaml

# Encrypt the secrets within that file
sops -e -i talsecret.sops.yaml
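# Note: sops needs an encryption key configured before this step works.
# A minimal sketch, assuming age is used, a key was generated with
# `age-keygen -o age.key`, and a .sops.yaml like the following exists
# (the age recipient is a placeholder):
#
#   creation_rules:
#     - path_regex: talsecret\.sops\.yaml
#       age: <your age public key>
#
# export SOPS_AGE_KEY_FILE=./age.key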

# Generate talosconfig file and machine configs in ./clusterconfig directory
talhelper genconfig

# Confirm that the installation will take place on the correct disk and adjust accordingly
# (192.168.0.186 / 192.168.0.123 are the temporary DHCP addresses the nodes picked up when
# booted from the Talos ISO; the static addresses from talconfig.yaml only take effect once
# the machine configs are applied)
talosctl get disks -n 192.168.0.186 -e 192.168.0.186 --insecure
talosctl get disks -n 192.168.0.123 -e 192.168.0.123 --insecure

# Apply machine configs to nodes
talosctl apply-config --insecure -n 192.168.0.186 --file ./clusterconfig/talos-cluster-control-plane-01.yaml --talosconfig ./clusterconfig/talosconfig

talosctl apply-config --insecure -n 192.168.0.123 --file ./clusterconfig/talos-cluster-worker-01.yaml --talosconfig ./clusterconfig/talosconfig

# Bootstrap the cluster
talosctl bootstrap -n 192.168.0.11 -e 192.168.0.11 --talosconfig ./clusterconfig/talosconfig

# Copy the cluster's kubeconfig to the management machine
talosctl kubeconfig -n 192.168.0.11 -e 192.168.0.11 --talosconfig ./clusterconfig/talosconfig .
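
# (Optional) sanity checks after bootstrap, assuming the previous command
# saved the kubeconfig as ./kubeconfig
talosctl health -n 192.168.0.11 -e 192.168.0.11 --talosconfig ./clusterconfig/talosconfig
kubectl --kubeconfig ./kubeconfig get nodes -o wide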

02 - Selecting Cilium as our CNI

Since we want to use Cilium as our CNI, the following steps need to be done prior to bootstrapping the cluster:

# talconfig.yaml
clusterName: talos-cluster
endpoint: https://192.168.0.11:6443
talosVersion: v1.9.2
nodes:
  - machineSpec:
      mode: metal
      arch: amd64
      secureboot: false
    hostname: talos-control-plane-01
    controlPlane: true
    ipAddress: 192.168.0.11
    installDisk: /dev/sda
    nameserver:
      - time.cloudflare.com
    networkInterfaces:
      - interface: eno1
        dhcp: false
        addresses:
        - 192.168.0.11/24
        routes:
          - gateway: 192.168.0.1
#  - machineSpec:
#      mode: metal
#      arch: amd64
#      secureboot: false
#    hostname: worker-01
#    controlPlane: false
#    ipAddress: 192.168.0.12
#    installDisk: /dev/sdb
#    nameserver:
#      - time.cloudflare.com
#    networkInterfaces:
#      - interface: eno1
#        dhcp: true
#        addresses:
#        - 192.168.0.12/24
#        routes:
#          - gateway: 192.168.0.1
patches:
  - "@./patches/patch.yaml"
# patch.yaml
cluster:
  allowSchedulingOnControlPlanes: true
  network:
    cni:
      name: none
  proxy:
    disabled: true

Deploying Cilium onto the cluster can then be done with Helm (the preferred way). Note that k8sServiceHost=localhost and k8sServicePort=7445 point Cilium at Talos's KubePrism endpoint, which load-balances Kubernetes API requests locally on every node:

helm install cilium https://helm.cilium.io/cilium-1.17.1.tgz \
    --namespace kube-system \
    --set ipam.mode=kubernetes \
    --set kubeProxyReplacement=true \
    --set operator.replicas=1 \
    --set securityContext.capabilities.ciliumAgent="{CHOWN,KILL,NET_ADMIN,NET_RAW,IPC_LOCK,SYS_ADMIN,SYS_RESOURCE,DAC_OVERRIDE,FOWNER,SETGID,SETUID}" \
    --set securityContext.capabilities.cleanCiliumState="{NET_ADMIN,SYS_ADMIN,SYS_RESOURCE}" \
    --set cgroup.autoMount.enabled=false \
    --set cgroup.hostRoot=/sys/fs/cgroup \
    --set k8sServiceHost=localhost \
    --set k8sServicePort=7445 \
    --set ingressController.enabled=true \
    --set ingressController.default=true \
    --set ingressController.loadbalancerMode=shared
# The following five settings can instead be supplied later via a values.yaml file
# for the corresponding environment (e.g. staging):
#    --set hubble.relay.enabled=true \
#    --set hubble.ui.enabled=true \
#    --set l2announcements.enabled=true \
#    --set k8sClientRateLimit.qps=5 \
#    --set k8sClientRateLimit.burst=10
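
# (Optional) confirm that Cilium is up - the resource names below assume
# the chart's default naming
kubectl -n kube-system rollout status daemonset/cilium
kubectl -n kube-system rollout status deployment/cilium-operator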

Restart unmanaged Pods

https://docs.cilium.io/en/stable/installation/k8s-install-helm/#restart-unmanaged-pods

If you did not create a cluster with the nodes tainted with the taint node.cilium.io/agent-not-ready, then unmanaged pods need to be restarted manually. Restart all already running pods which are not running in host-networking mode to ensure that Cilium starts managing them. This is required to ensure that all pods which have been running before Cilium was deployed have network connectivity provided by Cilium and NetworkPolicy applies to them:

kubectl get pods --all-namespaces -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork --no-headers=true | grep '<none>' | awk '{print "-n "$1" "$2}' | xargs -L 1 -r kubectl delete pod
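
For the Cilium ingress (or any other LoadBalancer Service) to actually receive an address on a bare-metal network, Cilium also needs a load balancer IP pool and, if the commented l2announcements.enabled flag above is turned on, an L2 announcement policy. A minimal sketch, assuming the 192.168.0.240/28 range is unused on the LAN (names and range are placeholders):

apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: default-pool
spec:
  blocks:
    - cidr: 192.168.0.240/28
---
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2-policy
spec:
  loadBalancerIPs: true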

03 - Selecting Argo CD as our GitOps tool

Installing Argo CD onto the cluster using Helm:

# Create namespace
kubectl create namespace argocd

# Add the correct repo
helm repo add argo-helm https://argoproj.github.io/argo-helm

# Update the local Helm repository index
helm repo update

# Install the chart and enable Helm support in Kustomize
helm install argocd argo-helm/argo-cd --namespace argocd \
--version 7.8.14 \
--set configs.cm."kustomize\.buildOptions"="--enable-helm --load-restrictor=LoadRestrictionsNone"
NAME: argocd
LAST DEPLOYED: Wed Mar 26 16:26:57 2025
NAMESPACE: argocd
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
In order to access the server UI you have the following options:

1. kubectl port-forward service/argocd-server -n argocd 8080:443

    and then open the browser on http://localhost:8080 and accept the certificate

2. enable ingress in the values file `server.ingress.enabled` and either
      - Add the annotation for ssl passthrough: https://argo-cd.readthedocs.io/en/stable/operator-manual/ingress/#option-1-ssl-passthrough
      - Set the `configs.params."server.insecure"` in the values file and terminate SSL at your ingress: https://argo-cd.readthedocs.io/en/stable/operator-manual/ingress/#option-2-multiple-ingress-objects-and-hosts


After reaching the UI the first time you can login with username: admin and the random password generated during the installation. You can find the password by running:

kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d

(You should delete the initial secret afterwards as suggested by the Getting Started Guide: https://argo-cd.readthedocs.io/en/stable/getting_started/#4-login-using-the-cli)
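
With the port-forward from the NOTES above still running, the same credentials also work for the argocd CLI, and the initial secret can then be removed. A short sketch (the argocd CLI has to be installed separately; the password placeholder is the value retrieved above):

argocd login localhost:8080 --username admin --password <initial-password> --insecure
kubectl -n argocd delete secret argocd-initial-admin-secret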

Adding an ingress resource

To reach the Argo CD UI via a URL, we will add an Ingress resource. With the help of a Cloudflare private DNS record, the FQDN argocd.cloudandklir.com is resolvable locally on our private network:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-ingress
  namespace: argocd
spec:
  ingressClassName: cilium
  rules:
  - host: argocd.cloudandklir.com
    http:
      paths:
      - backend:
          service:
            name: argocd-server
            port:
              number: 443
        path: /
        pathType: Prefix
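
Once the Ingress is applied, the address assigned by Cilium's shared ingress load balancer should appear on the resource, and the Cloudflare private DNS record for argocd.cloudandklir.com should point at it. A quick check (the cilium-ingress Service name assumes the chart's default naming for shared mode):

kubectl -n argocd get ingress argocd-ingress
kubectl -n kube-system get svc cilium-ingress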