
OpenShift Secure Certificate Demo

As seen on Ask an OpenShift Admin | Ep 137 | Configuring Cluster Certificates

Cluster

  • Red Hat OpenShift Version 4.16.13
  • Red Hat CoreOS Version 4.16.3

Provision certs with certbot

We will provision two x509 certificates with ECDSA keys:

  • Wildcard cert for *.apps.<cluster_name>.<base_domain> (Ingress Controller)
  • Certificate for api.<cluster_name>.<base_domain> (API Server)

When getting a cert, you need to prove domain ownership by responding to a challenge. We'll use the DNS-01 challenge because we want a wildcard certificate, and DNS-01 is the only challenge type that supports wildcards.

We will add TXT records to our DNS server to satisfy the DNS-01 challenge. For this example we'll use --dry-run.

# Add TXT record: name, value, TTL=2 min

# Apps (wildcard)
sudo certbot certonly -d '*.apps.<cluster_name>.<base_domain>' --manual --preferred-challenges dns --dry-run

# API
sudo certbot certonly -d 'api.<cluster_name>.<base_domain>' --manual --preferred-challenges dns --dry-run
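
# Optional sanity check: confirm a TXT record has propagated before
# completing its challenge (DNS-01 places records under _acme-challenge.<name>)
dig +short TXT _acme-challenge.apps.<cluster_name>.<base_domain>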

# Remove TXT records when finished

# Show certs using certbot
sudo certbot certificates

# Show certs via shell (fedora in this case)
sudo tree /etc/letsencrypt/live

We get four files for each cert. Throughout this demo we'll keep each certificate's files in their own directory, apps and api respectively:

  • cert.pem: Our x509 Certificate
  • chain.pem: Certificate chain from the CA
  • fullchain.pem: Cert + Chain in one file
  • privkey.pem: Private key (ECDSA)

PEM: Privacy Enhanced Mail format. Certs will start with BEGIN CERTIFICATE and finish with END CERTIFICATE. Private keys will use PRIVATE KEY.

ECDSA: Elliptic Curve Digital Signature Algorithm (elliptic curve cryptography). One telltale sign: the public and private keys are much shorter than their RSA counterparts.
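A quick way to see both points, assuming the apps directory from above:

# PEM markers
grep -e '-----BEGIN' -e '-----END' apps/cert.pem apps/privkey.pem

# Key size: an ECDSA P-256 key reports 256 bit (vs. 2048+ for typical RSA)
openssl ec -in apps/privkey.pem -text -noout 2>/dev/null | head -1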

Review cert subjects and issuers

for domain in apps api
do
  echo Domain: ${domain}.<cluster_name>.<base_domain>
  while openssl x509 -noout -subject -issuer -ext subjectAltName 2>/dev/null; do :; done < ${domain}/cert.pem
  echo
done

To see all info for the first cert in the file: openssl x509 -noout -text -in CERT_FILE

Cert Prerequisites for OpenShift

  • Cert for the FQDN + private key in separate files
  • Private key must be unencrypted
  • Cert must have Subject Alternative Name set to the FQDN (for wildcards)
  • Cert file can have a chain of trust (multiple certs in one file)

The order of that chain of trust is:

  • Issued Certificate
  • Intermediate CA Certificate(s), if any
  • Root CA Certificate

Everything after the issued cert forms the "CA chain".
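If a CA hands you the pieces separately, assembling that order is just concatenation (hypothetical file names):

cat issued.pem intermediate.pem root.pem > chained.pem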

Verify Issued cert against CA chain

When there's a CA chain in the cert file, verify that it's ok before using it. Let's use our fullchain files:

for domain in apps api
do
  echo Domain: ${domain}.<cluster_name>.<base_domain>
  openssl verify -verbose -CAfile $domain/fullchain.pem $domain/fullchain.pem
  echo
done

Despite its name, fullchain.pem does not include the complete CA chain. What's missing? Let's look:

for domain in apps api
do
  echo Domain: ${domain}.<cluster_name>.<base_domain>
  while openssl x509 -noout -subject -issuer -ext subjectAltName 2>/dev/null; do :; done < ${domain}/fullchain.pem
  echo
done

ISRG Root X1 is missing!

We can download it from Let's Encrypt and append it, producing fullchain.root.pem.

for domain in apps api
do
  echo Domain: ${domain}.<cluster_name>.<base_domain>
  # Via https://letsencrypt.org/certificates/
  curl -o ${domain}/root.pem https://letsencrypt.org/certs/isrgrootx1.pem
  cat ${domain}/fullchain.pem ${domain}/root.pem > ${domain}/fullchain.root.pem
done

Look at the subject and issuer now:

for domain in apps api
do
  echo Domain: ${domain}.<cluster_name>.<base_domain>
  while openssl x509 -noout -subject -issuer -ext subjectAltName 2>/dev/null; do :; done < ${domain}/fullchain.root.pem
  echo
done

Now try again with the full chain + root file:

for domain in apps api
do
  echo Domain: ${domain}.<cluster_name>.<base_domain>
  openssl verify -verbose -CAfile $domain/fullchain.root.pem $domain/fullchain.root.pem
  echo
done

Verify issued cert against the private key

We're using ECDSA keys, so we will match public key output.

for domain in apps api
do
  # ECDSA key: match public key output
  echo Domain: ${domain}.<cluster_name>.<base_domain>
  openssl x509 -in $domain/cert.pem -pubkey -noout
  openssl ec -in $domain/privkey.pem -pubout
  echo
done
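Eyeballing two PEM dumps is error-prone; hashing both public keys reduces the check to comparing two digests (same files as above):

for domain in apps api
do
  echo Domain: ${domain}.<cluster_name>.<base_domain>
  openssl x509 -in $domain/cert.pem -pubkey -noout | openssl sha256
  openssl ec -in $domain/privkey.pem -pubout 2>/dev/null | openssl sha256
  echo
done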

If we were using an x509 cert + RSA key, we'd match the md5 of the modulus instead.

To generate a test pair (using an rsa directory):

mkdir -p rsa
openssl genrsa -out rsa/privkey.pem 2048
openssl req -new -x509 -key rsa/privkey.pem -out rsa/cert.pem -days 365 -subj '/CN=rsa-demo'

openssl x509 -in rsa/cert.pem -modulus -noout | openssl md5
openssl rsa -in rsa/privkey.pem -modulus -noout | openssl md5

Remove password from key

# No need - ours are passwordless
openssl rsa -in rsa/privkey.pem -out rsa/privkey.nopass.pem

Log in to cluster

Remove the local kube config so we start with a clean slate, then log in as kubeadmin:

rm ~/.kube/config  # rather brute force but we'll go with it for now
oc login -u kubeadmin --server=https://api.<cluster_name>.<base_domain>:6443

Log in to the web console as kubeadmin and select the following:

kube:admin (top-right) > Copy login command... > Display Token > Log in with this token

Observe that the kube config has insecure-skip-tls-verify: true. Back up the kubeconfig:

oc config view --flatten | tee PATH/TO/kubeconfig.latest

Keep this file restricted: it allows API access without a username/password for a limited time. In case of emergency:

export KUBECONFIG=FULL/PATH/TO/kubeconfig.latest

After rolling back or adjusting the API Server, unset KUBECONFIG and re-verify.

Fun oc tricks

# Diff with cluster before oc apply:
oc diff -f YAML_FILE

# Dry run on client - what would have been stored
oc CREATE_RESOURCE --dry-run=client

# Non-destructive dry run on server + validate against schema
oc CHANGE_RESOURCE --dry-run=server --validate=true
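For example, a client-side dry run renders the object without touching the cluster (hypothetical ConfigMap name and data):

oc create configmap demo-cm --from-literal=key=value --dry-run=client -o yaml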

Cluster-wide proxy

Create config map with root CA

First we will put the root CA into a ConfigMap.

# Bundle name can be whatever you want
export INGRESS_CA_BUNDLE_NAME=custom-ca
export INGRESS_CA_BUNDLE_FILE=apps/root.pem

# Cluster-wide proxy will look for the key "ca-bundle.crt"
oc create configmap $INGRESS_CA_BUNDLE_NAME \
  --namespace=openshift-config \
  --from-file=ca-bundle.crt=$INGRESS_CA_BUNDLE_FILE \
  --dry-run=client -o yaml \
  | tee $INGRESS_CA_BUNDLE_NAME.yaml
# If it looks good, apply it
oc apply -f $INGRESS_CA_BUNDLE_NAME.yaml
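You can also confirm the key landed via the CLI:

oc get configmap $INGRESS_CA_BUNDLE_NAME \
  --namespace=openshift-config \
  -o jsonpath='{.data.ca-bundle\.crt}' | head -3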

Find it in the web console: https://console-openshift-console.apps.<cluster_name>.<base_domain>/k8s/ns/openshift-config/configmaps/custom-ca/yaml

Home > Search > Project: openshift-config > Resource: ConfigMap > Name: custom-ca

Patch cluster-wide proxy config

The proxy config is where we reference our root CA ConfigMap; the Ingress Controller will pick it up from there.

The following command works fine, just be careful with it! Not sure? oc patch also works with --dry-run=server.

oc patch proxy cluster --type=merge \
  --patch='{"spec":{"trustedCA":{"name":"'${INGRESS_CA_BUNDLE_NAME}'"}}}'

Or use the web console: https://console-openshift-console.apps.<cluster_name>.<base_domain>/k8s/cluster/config.openshift.io~v1~Proxy/cluster/yaml

Administration > Cluster Settings > Configuration > Proxy > YAML

spec:
  trustedCA:
    name: custom-ca

Wait for update

watch -n 5 "oc get co | grep -v 'True        False         False'"

Wait for AVAILABLE/PROGRESSING/DEGRADED to reach True/False/False.
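If you'd rather block than watch, oc wait can poll for you (a sketch; pick a timeout that suits your cluster):

oc wait clusteroperators --all --for=condition=Progressing=False --timeout=15m
oc wait clusteroperators --all --for=condition=Degraded=False --timeout=15m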

Or use the web console: https://console-openshift-console.apps.<cluster_name>.<base_domain>/settings/cluster/clusteroperators?rowFilter-cluster-operator-status=Progressing%2CDegraded%2CCannot+update%2CUnavailable%2CUnknown

Administration > Cluster Settings > ClusterOperators > Filter > Status: Check everything except Available

Wait for the following operators to respond to this change:

  • authentication
  • console
  • image-registry
  • openshift-apiserver
  • openshift-controller-manager

Ingress Controller cert and key

Show the current certificate in use

echo Q | openssl s_client \
  -connect console-openshift-console.apps.<cluster_name>.<base_domain>:443 \
  -showcerts 2>/dev/null > apps-certs.out

while openssl x509 -noout -subject -issuer -ext subjectAltName -enddate \
  2>/dev/null; do :; done < apps-certs.out

Create cert secret

# Secret name can be whatever you want
export INGRESS_CERT_SECRET_NAME=ingress

# Can use fullchain without the root CA but we'll keep it.
export INGRESS_CERT_PATH=apps/fullchain.root.pem
export INGRESS_KEY_PATH=apps/privkey.pem

oc create secret tls $INGRESS_CERT_SECRET_NAME \
  --namespace=openshift-ingress \
  --cert=$INGRESS_CERT_PATH \
  --key=$INGRESS_KEY_PATH \
  --dry-run=client -o yaml \
  | tee $INGRESS_CERT_SECRET_NAME.yaml
# If it looks good, apply it
oc apply -f $INGRESS_CERT_SECRET_NAME.yaml

Find it in the web console: https://console-openshift-console.apps.<cluster_name>.<base_domain>/k8s/ns/openshift-ingress/secrets/ingress/yaml

Home > Search > Project > Show default projects: ON > Select project...: openshift-ingress > Resources > Select Resource: secret > Name: ingress

Patch ingress operator config

Tell the ingress operator where to find our cert and key:

oc patch ingresscontroller.operator default --type=merge \
  -p '{"spec":{"defaultCertificate": {"name": "'$INGRESS_CERT_SECRET_NAME'"}}}' \
  -n openshift-ingress-operator
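A quick CLI check of the new spec:

oc get ingresscontroller default \
  -n openshift-ingress-operator \
  -o jsonpath='{.spec.defaultCertificate.name}'; echo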

Or use the web console: https://console-openshift-console.apps.<cluster_name>.<base_domain>/k8s/ns/openshift-ingress-operator/operator.openshift.io~v1~IngressController/default/yaml

Administration > Cluster Settings > Configuration > IngressController > YAML

spec:
  defaultCertificate:
    name: ingress
  ... leave the rest as-is ...

Wait for update

watch -n 5 "oc get co | grep -v 'True        False         False'"

Wait for AVAILABLE/PROGRESSING/DEGRADED to reach True/False/False.

Or use the web console: https://console-openshift-console.apps.<cluster_name>.<base_domain>/settings/cluster/clusteroperators?rowFilter-cluster-operator-status=Progressing%2CDegraded%2CCannot+update%2CUnavailable%2CUnknown

Administration > Cluster Settings > ClusterOperators > Filter > Status: Check everything except Available

Wait for the following operators to respond to this change:

  • authentication
  • console
  • ingress
  • kube-controller-manager
  • kube-scheduler

Check for evidence of cert

echo Q | openssl s_client \
  -connect console-openshift-console.apps.<cluster_name>.<base_domain>:443 \
  -showcerts 2>/dev/null > apps-certs.out

while openssl x509 -noout -subject -issuer -ext subjectAltName -enddate \
  2>/dev/null; do :; done < apps-certs.out

Check in web browser

Re-open the console in a new browser tab/window and it should now show as secure.

Troubleshooting

This checks oauth, which also lives under the apps wildcard:

curl -vvk https://oauth-openshift.apps.<cluster_name>.<base_domain>/healthz

API Server cert and key

Show the current certificate in use

echo Q | openssl s_client \
  -connect api.<cluster_name>.<base_domain>:6443 \
  -showcerts 2>/dev/null > api-certs.out

while openssl x509 -noout -subject -issuer -ext subjectAltName -enddate \
  2>/dev/null; do :; done < api-certs.out

Create cert secret

# Secret name can be whatever you want
export API_CERT_SECRET_NAME=api

# MUST use full chain with the Root CA
export API_CERT_PATH=api/fullchain.root.pem
export API_KEY_PATH=api/privkey.pem

oc create secret tls $API_CERT_SECRET_NAME \
  --namespace=openshift-config \
  --cert=$API_CERT_PATH \
  --key=$API_KEY_PATH \
  --dry-run=client -o yaml \
  | tee $API_CERT_SECRET_NAME.yaml
# If it looks good, apply it
oc apply -f $API_CERT_SECRET_NAME.yaml

Find it in the web console: https://console-openshift-console.apps.<cluster_name>.<base_domain>/k8s/ns/openshift-config/secrets/api/yaml

Home > Search > Project > Show default projects: ON > Select project...: openshift-config > Resources > Select Resource: secret > Name: api

Patch APIServer config

Tell the API Server where to find our cert and key:

export API_SERVER_FQDN=api.<cluster_name>.<base_domain>

oc patch apiserver cluster --type=merge \
  -p '{"spec":{"servingCerts": {"namedCertificates": [{"names": ["'$API_SERVER_FQDN'"], "servingCertificate": {"name": "'$API_CERT_SECRET_NAME'"}}]}}}'

Or use the web console: https://console-openshift-console.apps.<cluster_name>.<base_domain>/k8s/cluster/config.openshift.io~v1~APIServer/cluster/yaml

Home > Search > Project: openshift-cluster-version > Resource: APIServer > cluster > YAML

spec:
  servingCerts:
    namedCertificates:
    - names:
      - api.<cluster_name>.<base_domain>
      servingCertificate:
        name: api
  ... leave the rest as-is ...

Personal preference: I keep my list indicators inside the 2-character margin. This prevents runaway indentation, and the YAML is still well-formed and valid. Text then always indents on a 2-character boundary!

Wait for update

watch -n 5 "oc get co | grep -v 'True        False         False'"

Wait for AVAILABLE/PROGRESSING/DEGRADED to reach True/False/False.

Or use the web console: https://console-openshift-console.apps.<cluster_name>.<base_domain>/settings/cluster/clusteroperators?rowFilter-cluster-operator-status=Progressing%2CDegraded%2CCannot+update%2CUnavailable%2CUnknown

Administration > Cluster Settings > ClusterOperators > Filter > Status: Check everything except Available

Wait for the following operator to respond to this change:

  • kube-apiserver

Check for evidence of cert

echo Q | openssl s_client \
  -connect api.<cluster_name>.<base_domain>:6443 \
  -showcerts 2>/dev/null > api-certs.out

while openssl x509 -noout -subject -issuer -ext subjectAltName -enddate \
  2>/dev/null; do :; done < api-certs.out

Log in again

oc login -u kubeadmin --server=https://api.<cluster_name>.<base_domain>:6443

No more unknown authority warning! And the config view no longer mentions skipping TLS verification:

oc config view --flatten
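A quick check that the flag is really gone:

oc config view --flatten | grep insecure-skip-tls-verify || echo "no insecure entries"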

Troubleshooting

curl -vvk https://api.<cluster_name>.<base_domain>:6443/healthz

Rollback

API Server

oc patch apiserver cluster --type json \
  --patch '[{ "op": "remove", "path": "/spec/servingCerts" }]'

Or go back to Cluster Settings and remove servingCerts from the APIServer config.

Wait for update

watch -n 5 "oc get co | grep -v 'True        False         False'"

Wait for AVAILABLE/PROGRESSING/DEGRADED to reach True/False/False.

Or use the web console: https://console-openshift-console.apps.<cluster_name>.<base_domain>/settings/cluster/clusteroperators?rowFilter-cluster-operator-status=Progressing%2CDegraded%2CCannot+update%2CUnavailable%2CUnknown

Administration > Cluster Settings > ClusterOperators > Filter > Status: Check everything except Available

Wait for the following operator to respond to this change:

  • kube-apiserver

Next time you log in with oc you will once again see The server uses a certificate signed by an unknown authority.

Ingress Controller

STOP: If you still have the API Server certificate configured, beware: rolling back the Ingress Controller certificate causes the oauth app to serve the default certificate again, and oc login will show this the next time you log in:

error: tls: failed to verify certificate: x509: certificate signed by unknown authority

To rollback:

oc patch ingresscontroller.operator default --type json \
  --patch '[{ "op": "remove", "path": "/spec/defaultCertificate" }]' \
  -n openshift-ingress-operator

Or go back to Cluster Settings and remove defaultCertificate from the IngressController config.

Wait for update

watch -n 5 "oc get co | grep -v 'True        False         False'"

Wait for AVAILABLE/PROGRESSING/DEGRADED to reach True/False/False.

Or use the web console (if you're daring): https://console-openshift-console.apps.<cluster_name>.<base_domain>/settings/cluster/clusteroperators?rowFilter-cluster-operator-status=Progressing%2CDegraded%2CCannot+update%2CUnavailable%2CUnknown

Administration > Cluster Settings > ClusterOperators > Filter > Status: Check everything except Available

Wait for the following operators to respond to this change:

  • authentication
  • console
  • ingress
  • kube-controller-manager
  • kube-scheduler

The OpenShift Console may also briefly show Error Loading ClusterOperators: Failed to fetch (reload), then possibly Error Loading: Failed to fetch.

HTTP Strict Transport Security

At this point you will likely need to work around HSTS (HTTP Strict Transport Security) in order to view the OpenShift console again.

  • Firefox: You'll see "Did Not Connect: Potential Security Issue"
    • Close all console tabs
    • Bring up Full History (Ctrl+Shift+H / Cmd+Shift+H on macOS)
    • Right-click console-openshift-console.apps.<cluster_name>.<base_domain> > Forget About This Site
  • Chrome: Page will not load
    • Close all console tabs
    • Bring up Net Internals: Domain Security Policy at chrome://net-internals/#hsts (must copy/paste, otherwise it gets blocked)
    • Query HSTS/PKP domain > console-openshift-console.apps.<cluster_name>.<base_domain> > Query
    • If found, enter it in Delete domain > Delete
    • May need to do the same for oauth-openshift.apps.<cluster_name>.<base_domain>
    • You may also have to restart the browser after this anyway!

Cluster-wide proxy

oc patch proxy cluster --type json \
  --patch '[{ "op": "remove", "path": "/spec/trustedCA" }]'

Or go back to Cluster Settings and remove trustedCA from the Proxy config.

Wait for update

watch -n 5 "oc get co | grep -v 'True        False         False'"

Wait for AVAILABLE/PROGRESSING/DEGRADED to reach True/False/False.

Or use the web console (if you're daring): https://console-openshift-console.apps.<cluster_name>.<base_domain>/settings/cluster/clusteroperators?rowFilter-cluster-operator-status=Progressing%2CDegraded%2CCannot+update%2CUnavailable%2CUnknown

Administration > Cluster Settings > ClusterOperators > Filter > Status: Check everything except Available

Wait for the following operators to respond to this change:

  • authentication
  • console
  • image-registry
  • openshift-apiserver
  • openshift-controller-manager

Appendix

Links of Interest
