@clemenko
Last active September 16, 2025 04:45

setting up Rancher with certs - example

Docs : https://ranchermanager.docs.rancher.com/getting-started/installation-and-upgrade/resources/add-tls-secrets

install rke2

curl -sfL https://get.rke2.io | sh -
systemctl enable --now rke2-server.service

set up env

echo "export KUBECONFIG=/etc/rancher/rke2/rke2.yaml PATH=$PATH:/usr/local/bin/:/var/lib/rancher/rke2/bin/" >> ~/.bashrc
source ~/.bashrc

install helm

curl -s https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

add repos

helm repo add rancher-latest https://releases.rancher.com/server-charts/latest --force-update 
helm repo add jetstack https://charts.jetstack.io --force-update 

install cert-manager

helm upgrade -i cert-manager jetstack/cert-manager -n cert-manager --create-namespace --set crds.enabled=true 

add secrets

kubectl create ns cattle-system

kubectl -n cattle-system create secret tls tls-rancher-ingress --cert=/root/star.rfed.io.cert --key=/root/star.rfed.io.key

kubectl -n cattle-system create secret generic tls-ca --from-file=/root/cacerts.pem 
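Before creating the tls-rancher-ingress secret, it can be worth confirming that the certificate and key actually belong together. The following is a minimal sketch, assuming openssl is installed. The helper name cert_matches_key is hypothetical, and the paths in the comment are the ones used above.

```shell
# cert_matches_key CERT KEY
# Succeeds only if the first certificate in CERT and the private key in KEY
# carry the same public key.
cert_matches_key() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 1
  key_pub=$(openssl pkey -in "$2" -pubout 2>/dev/null) || return 1
  [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Example call with the files from this guide:
#   cert_matches_key /root/star.rfed.io.cert /root/star.rfed.io.key && echo "pair OK"
```

Comparing the extracted public keys rather than hashing the files works for both RSA and EC keys, and it only looks at the leaf certificate even when the file holds a full chain.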

install rancher with tls certs

helm upgrade -i rancher rancher-latest/rancher -n cattle-system --create-namespace \
  --set hostname=rancher.rfed.io \
  --set bootstrapPassword=bootStrapAllTheThings \
  --set replicas=1 \
  --set ingress.tls.source=secret \
  --set ingress.tls.secretName=tls-rancher-ingress \
  --set privateCA=true
@clemenko
Author

clemenko commented Nov 2, 2024

huh... what version of rke2 and kubectl are you using?

@clemenko
Author

clemenko commented Nov 2, 2024

have you updated helm as well?

@me1iissa

me1iissa commented Nov 2, 2024

Pulled from Rancher Manager:
Provider: RKE2
Kubernetes Version: v1.30.5+rke2r1
Architecture: Amd64
Created: 1.1 hours ago

kubectl
root@rancher-uk-01:~# kubectl version
Client Version: v1.30.5+rke2r1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.5+rke2r1

I did not manually update helm after installing it using curl -s https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Helm Version
root@rancher-uk-01:~# helm version
version.BuildInfo{Version:"v3.16.2", GitCommit:"13654a52f7c70a143b1dd51416d633e1071faffb", GitTreeState:"clean", GoVersion:"go1.22.7"}

@clemenko
Author

clemenko commented Nov 2, 2024

that looks all current. odd. can you recreate it easily?

@me1iissa

me1iissa commented Nov 2, 2024

Sure, let me spin up a new VM and I will follow the steps again.

@me1iissa

me1iissa commented Nov 2, 2024

So after creating two new VMs, I can say that my initial issue was some odd fluke.

It works fine on both fresh VMs with /root/cacerts.pem after copying the same commands from the first node.
Weird; I'm not sure what happened with the previous node, considering I am using the same commands.

However, the file has to be named cacerts.pem. If it is named anything else, it errors out with "failed to setup TLS listener: read /etc/rancher/ssl/cacerts.pem: is a directory", like before. (This was from further testing.)
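The file name matters because kubectl uses it as the key inside the secret, and Rancher looks for a key called cacerts.pem. A minimal sketch of one way around renaming the file, using the key=path form of --from-file; the helper name create_tls_ca_secret and the example path are hypothetical:

```shell
# create_tls_ca_secret CA_FILE
# Stores CA_FILE in the tls-ca secret under the key "cacerts.pem",
# whatever the file is called on disk.
create_tls_ca_secret() {
  ca_file="$1"
  if [ ! -f "$ca_file" ]; then
    echo "CA file not found: $ca_file" >&2
    return 1
  fi
  kubectl -n cattle-system create secret generic tls-ca \
    --from-file=cacerts.pem="$ca_file" \
    --dry-run=client -o yaml | kubectl apply -f -
}

# Example call (hypothetical path):
#   create_tls_ca_secret /root/my-root-ca.crt
```

The dry-run piped into kubectl apply makes the command safe to re-run when the secret already exists.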

Thanks for the help.

@clemenko
Author

clemenko commented Nov 2, 2024

glad it worked out. I have seen similar oddities.

@toti18

toti18 commented Dec 4, 2024

Hi Andy, how are you doing? Hope you're good, man :-)

Andy, the guys deployed a new root CA here at the company, so I'll have to update the certificate in my Rancher environment, and I'm taking a look at how to do it.
As this is the first time I'm doing this, I'm looking into the documentation (https://ranchermanager.docs.rancher.com/v2.8/getting-started/installation-and-upgrade/resources/update-rancher-certificate), and I also found your video on YouTube (thanks for that, really helpful, Andy, not to mention the funny moment where you say you forgot the cert files haha)

One question that I have is when I get to that part related to update the CA certificate secret object (cacerts.pem)...
I've got the new root CA .pfx file, so in order to have the cacerts.pem I'll have to convert the .pfx file into the .pem format.
I believe that the command below (with the "-nodes" parameter) is the one I need to do that conversion, right?
openssl pkcs12 -in mycert.pfx -out cacerts.pem -nodes

If you could tell how you got your cacerts.pem file, I would really appreciate it.

Best regards,
toti

@clemenko
Author

clemenko commented Dec 4, 2024

Hey toti! Things are good. Glad the guides are helpful. According to stackoverflow : https://stackoverflow.com/questions/15413646/converting-pfx-to-pem-using-openssl Yes it does look like that command should work!
I get my cacerts.pem from my Cert provider. It helps that I do almost everything in linux.
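One thing to watch with the -nodes form: it also writes any private key in the .pfx to the output file, unencrypted, which is not something that belongs in cacerts.pem. A minimal sketch of a certificates-only export, assuming openssl is installed; the helper name pfx_to_cacerts is hypothetical, and mycert.pfx is the file name from the question:

```shell
# pfx_to_cacerts PFX OUT PASSWORD
# Writes only the certificates from a PKCS#12 bundle to OUT, leaving out
# private keys and the "Bag Attributes" text that openssl prints.
pfx_to_cacerts() {
  openssl pkcs12 -in "$1" -nokeys -passin "pass:$3" |
    sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' > "$2"
  # Fail if nothing was extracted (wrong password, no certificates in the bundle).
  [ -s "$2" ]
}

# Example call (the password is a placeholder):
#   pfx_to_cacerts mycert.pfx cacerts.pem 'pfx-password'
```

If the bundle also carries a server certificate, adding -cacerts to the openssl pkcs12 call should restrict the output to the CA certificates; it is worth opening the result and checking which certificates ended up in it.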

@toti18

toti18 commented Dec 4, 2024

Oh great Andy, thank you very much!
I don't have much time today, but I'll try it tomorrow.
+1 like/subscribed on your YT channel
See ya!

@clemenko
Author

clemenko commented Dec 4, 2024

Awesome thank you! Let me know how it goes.

@andisugandi

andisugandi commented Dec 15, 2024

Hi @clemenko ,

Thank you very much for the guidance & video. Again, those are very helpful.

I have followed your tutorial with a bit of customization in the SSL certificate (using letsEncrypt):

helm install rancher rancher-latest/rancher --namespace cattle-system --set hostname=rancher.awesome.com --set bootstrapPassword=pleas-change-me --set ingress.tls.source=letsEncrypt --set letsEncrypt.email=[email protected] --set letsEncrypt.ingress.class=nginx

But unfortunately got an issue reported here: rancher/rke2#7433 .

What do you think?

@clemenko
Author

Brandon is really sharp. map out on that thread how your network is laid out.

@joshyorko

Hi @andisugandi,

I know this is probably unrelated to your connection timeout issue with the Rancher endpoint, but I wanted to share a recommendation regarding the method you used to install the certificates.

I’ve tried a similar approach in the past, and while it wasn’t the cause of your specific issue, it did fail for me due to Rancher not receiving the full certificate chain. This led to problems with TLS handshake validation.

Instead, I’d highly recommend following the method outlined in [Techno Tim’s guide](https://technotim.live/posts/kube-traefik-cert-manager-le/) for setting up wildcard certificates with Traefik and cert-manager. His guide walks you through using DNS validation with your DNS provider’s API token to request certificates directly from Let’s Encrypt. You can even practice with staging certificates, but I’d suggest going “full send” and using production.

Why This Matters:

Rancher handles certificates differently and requires the full certificate chain for proper TLS validation. If this chain isn’t provided, it can lead to handshake failures. To ensure everything works smoothly, you need to prepare your certificates and secrets before installing Rancher.

The Correct Process:

Assuming you’ve followed Tim’s guide and cert-manager has issued certificates, you should have a certificate and corresponding secret in your namespace. For example:

➜ kubectl get certificates
NAME       READY   SECRET         AGE
yorko-io   True    yorko-io-tls   11m

➜ kubectl get secrets
NAME           TYPE                DATA   AGE
yorko-io-tls   kubernetes.io/tls   2      13m

Now, here’s how you prepare these certificates for Rancher:

Steps:

  1. Create the namespace for Rancher:

    kubectl create namespace cattle-system
  2. Add the Rancher Helm repository and update it:

    helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
    helm repo update
  3. (Optional) Generate a Rancher admin password to pass as bootstrapPassword in step 10 in place of the example value:

    RANCHER_PASSWORD=$(openssl rand -base64 12)
  4. Extract the tls.crt from your cert-manager secret:

    kubectl get secret yorko-io-tls -n default -o jsonpath='{.data.tls\.crt}' | base64 --decode > tls.crt
  5. Extract the tls.key from your cert-manager secret:

    kubectl get secret yorko-io-tls -n default -o jsonpath='{.data.tls\.key}' | base64 --decode > tls.key
  6. Download the Let’s Encrypt CA certificate:

    curl -o letsencrypt-ca.pem https://letsencrypt.org/certs/isrgrootx1.pem
  7. Combine the domain certificate and CA certificate:

    cat tls.crt letsencrypt-ca.pem > combined-tls.crt
  8. Create a Kubernetes secret for the CA certificate:

    kubectl -n cattle-system create secret generic tls-ca --from-file=cacerts.pem=letsencrypt-ca.pem --dry-run=client -o yaml | kubectl apply -f -
  9. Create or update the Rancher TLS secret for the ingress:

    kubectl -n cattle-system create secret tls tls-rancher-ingress --cert=combined-tls.crt --key=tls.key --dry-run=client -o yaml | kubectl apply -f -
  10. Install or upgrade Rancher using Helm:

    helm upgrade -i rancher rancher-latest/rancher -n cattle-system --create-namespace \
      --set hostname=rancher.yorko.io \
      --set bootstrapPassword=bootStrapAllTheThings \
      --set replicas=1 \
      --set ingress.tls.source=secret \
      --set ingress.tls.secretName=tls-rancher-ingress \
      --set privateCA=true

Why This Way Works:

Rancher requires the full certificate chain to be presented during the TLS handshake. The single domain certificate issued by Let’s Encrypt doesn’t include the intermediate CA, which some clients need to validate the connection. By combining the certificates and properly configuring the Helm chart, you ensure compatibility across all clients and prevent TLS handshake issues.
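Whether the chain is complete is easy to check before creating the secret. Depending on the issuer, the tls.crt that cert-manager stores may already contain the intermediate, so it is worth looking before appending anything. A minimal sketch, assuming openssl is installed; the helper names count_certs and list_chain are hypothetical:

```shell
# count_certs FILE: print how many PEM certificates FILE contains.
count_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# list_chain FILE: print the subject and issuer of every certificate in FILE,
# in file order (leaf first for a well-formed chain).
list_chain() {
  openssl crl2pkcs7 -nocrl -certfile "$1" | openssl pkcs7 -print_certs -noout
}

# Example calls with the files from the steps above:
#   count_certs tls.crt
#   list_chain combined-tls.crt
```

In a well-formed chain, each certificate's issuer matches the subject of the one after it.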

I hope this helps resolve any certificate-related issues and simplifies your setup for the future! Let me know if you have questions.

@ShubhamDesai17

ShubhamDesai17 commented Sep 15, 2025

Hi @clemenko

I am not able to register a node with Rancher.
The nodes go into the WaitingForDataSecret state.

[INFO] Role requested: etcd
[INFO] Role requested: controlplane
[INFO] CA strict verification is set to false
[INFO] Using default agent configuration directory /etc/rancher/agent
[INFO] Using default agent var directory /var/lib/rancher/agent
curl: (28) Operation timed out after 60002 milliseconds with 0 bytes received
[ERROR] 000 received while testing Rancher connection. Sleeping for 5 seconds and trying again
[INFO] Successfully tested Rancher connection
[INFO] Downloading rancher-system-agent binary from https:///assets/rancher-system-agent-amd64
[INFO] Successfully downloaded the rancher-system-agent binary.
[INFO] Downloading rancher-system-agent-uninstall.sh script from https:///assets/system-agent-uninstall.sh
[INFO] Successfully downloaded the rancher-system-agent-uninstall.sh script.
[INFO] Generating Cattle ID
curl: (28) Operation timed out after 60002 milliseconds with 0 bytes received
[ERROR] 000 received while downloading Rancher connection information. Sleeping for 5 seconds and trying again
curl: (28) Operation timed out after 60002 milliseconds with 0 bytes received
[ERROR] 000 received while downloading Rancher connection information. Sleeping for 5 seconds and trying again

I am getting this error message.
Could anyone help me resolve this issue?

Thank you

@clemenko
Author

There are a lot of variables to check. What does your config.yaml look like on the server and the agent nodes?
Did you change certs?
What does the network look like for the nodes?

@ShubhamDesai17

thank you @clemenko

I'm using RKE2 with Rancher installed via Helm in the cattle-system namespace.
I did not set a config.yaml manually for Rancher itself, but here are the relevant Helm values passed during installation:

hostname:
replicas: 3
ingress:
  enabled: true
  tls:
    source: secret
    secretName: tls-ingress
privateCA: true

The tls-ingress secret contains:
tls.crt: TLS certificate from an authorized CA (full chain, including the intermediate)
tls.key: private key
I also created a secret with the root CA.

I have created a cluster in Rancher and am using the node registration command provided by the Rancher UI. Here's the structure of the command I'm using:
curl -fL https:///system-agent-install.sh | sudo sh -s - --server https:// --label 'cattle.io/os=linux' --token --ca-checksum --etcd --controlplane

The agent successfully:
Connects to Rancher
Downloads the system-agent binary and uninstall script
But times out when trying to fetch Rancher connection info:

[ERROR] 000 received while downloading Rancher connection information. Sleeping for 5 seconds and trying again
curl: (28) Operation timed out after 60002 milliseconds with 0 bytes received

Steps I followed:

  1. Deploy Rancher on RKE2 cluster using Helm with TLS certificate.
  2. Access Rancher via https:// , UI loads correctly in the browser.
  3. Create a custom cluster on rancher UI and copy the node registration command.
  4. Run the node registration script on a separate Ubuntu machine.
  5. Observe the timeout during "downloading Rancher connection information".

For additional information:

[image]

@clemenko
Author

oh cool. Did you blank out the server name in the command curl -fL https:///system-agent-install.sh | sudo sh -s - --server https:// --label 'cattle.io/os=linux' --token --ca-checksum --etcd --controlplane or was it like that from the system?

@clemenko
Author

Also what kind of cluster are you adding? I am running "Import Existing" and it gives me the following curl. Notice the full url.

[Screenshot 2025-09-15 at 1 15 01 PM]

Similar with create.

[Screenshot 2025-09-15 at 1 16 51 PM]

@ShubhamDesai17

I created a custom cluster using the Rancher UI
and tried to run the registration command on each node, defining the role as controlplane, etcd, or worker.
My command is as shown in the second screenshot.

@clemenko
Author

Can you confirm that (A) the command has a server address in it, like "rancher.rfed.io" in mine, and (B) the nodes have 443/6443 access to the Rancher server?
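A quick way to check the port access from a node is a plain TCP connect test. A minimal sketch, assuming bash and the coreutils timeout command are available on the node; the helper name port_open is hypothetical, and the host in the comment is the example server name from this gist:

```shell
# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened within 5 seconds.
port_open() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example loop over the ports mentioned above:
#   for p in 443 6443; do
#     if port_open rancher.rfed.io "$p"; then echo "$p reachable"; else echo "$p blocked"; fi
#   done
```

This only proves that a TCP connection can be made. A proxy or firewall that accepts the connection and then stalls the HTTPS request would still pass, so a timeout like the curl (28) error above can need a closer look even when the ports test as open.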

@ShubhamDesai17

Yes, the command has the server address, and the nodes have 443/6443 access.
I troubleshot a little and think this is a problem related to token validation or authentication.

@clemenko
Author

Is there a script output from the node itself?

@ShubhamDesai17

I looked at the Rancher script that registers a new node (system-agent-install.sh).
It gets stuck connecting to https:///v3/connect/register

When I manually curl this URL, I get a 401 authentication error.

@clemenko
Author

Are you able to join the Rancher Users Slack https://slack.rancher.io/ ? That would be a better place to post logs and other conversations.

@ShubhamDesai17

sure, Thank you!
