Configuring RKE with the Azure out-of-tree external Cloud Provider

References

Cloud Provider Azure SIG

RKE reference for in-tree provider configuration

RKE

A cluster configured using RKE needs to include the following section in cluster.yaml in order to specify that we want to use an external Cloud Provider:

cloud_provider:
  name: external
  customCloudProvider: |-
      cloudConfigType: "merge"
      loadBalancerSku: "standard"

Additionally, the list of nodes must have the hostname_override option set for each node, for example:

nodes:
    - address: 20.54.122.134
      internal_address: 10.1.10.4
      hostname_override: control-node-0
      user: azureuser
      role:
        - controlplane
        - etcd
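
With those changes in place, provision (or reconcile) the cluster as usual. A minimal sketch, assuming your config file is named cluster.yaml as above:

$ rke up --config cluster.yaml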

Post-deployment configuration

There are a couple of things we need to do to get the external Azure Cloud Provider working:

  • Define a secret with the necessary credentials for the Cloud Controller Manager to be able to talk to the Azure API;
  • Configure RBAC for access to this secret;
  • Deploy the Cloud Controller Manager.

Define the Azure Cloud Provider secret

Create a file called azure.json and populate it with the following:

{
    "tenantId": "",
    "aadClientId": "",
    "aadClientSecret": "",
    "subscriptionId": "",
    "resourceGroup": "",
    "location": "",
    "subnetName": "",
    "securityGroupName": "",
    "vnetName": "",
    "cloudProviderBackoff": false,
    "useManagedIdentityExtension": false,
    "useInstanceMetadata": true
}
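
If you don't already have a service principal for the Cloud Provider to use, the Azure CLI can create one and print most of these values; in its output, appId maps to aadClientId, password to aadClientSecret, and tenant to tenantId. A sketch, where the name k8s-ccm is just an example and the Contributor role should be narrowed to suit your environment:

$ az account show --query id -o tsv   # subscriptionId
$ az ad sp create-for-rbac --name k8s-ccm --role Contributor \
    --scopes "/subscriptions/<subscriptionId>"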

Update the values for each of the empty keys. We then need to base64-encode this file for inclusion in our Secret resource definition:

$ base64 azure.json
ewogICAgInRlbmFudElkIjogIiIsCiAgICAiYWFkQ2xpZW50SWQiOiAiIiwKICAgICJhYWRDbGllbnRTZWNyZXQiOiAiIiwKICAgICJzdWJzY3JpcHRpb25JZCI6ICIiLAogICAgInJlc291cmNlR3JvdXAiOiAiIiwKICAgICJsb2NhdGlvbiI6ICIiLAogICAgInN1Ym5ldE5hbWUiOiAiIiwKICAgICJzZWN1cml0eUdyb3VwTmFtZSI6ICIiLAogICAgInZuZXROYW1lIjogIiIsCiAgICAiY2xvdWRQcm92aWRlckJhY2tvZmYiOiBmYWxzZSwKICAgICJ1c2VNYW5hZ2VkSWRlbnRpdHlFeHRlbnNpb24iOiBmYWxzZSwKICAgICJ1c2VJbnN0YW5jZU1ldGFkYXRhIjogdHJ1ZQp9Cgo=
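
Note that GNU coreutils' base64 wraps its output at 76 columns by default; if you see multiple lines, pass -w0 to keep the value on a single line:

$ base64 -w0 azure.json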

Use this to create secret.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: azure-cloud-provider
  namespace: kube-system
type: Opaque
data:
  cloud-config: ewogICAgInRlbmFudElkIjogIiIsCiAgICAiYWFkQ2xpZW50SWQiOiAiIiwKICAgICJhYWRDbGllbnRTZWNyZXQiOiAiIiwKICAgICJzdWJzY3JpcHRpb25JZCI6ICIiLAogICAgInJlc291cmNlR3JvdXAiOiAiIiwKICAgICJsb2NhdGlvbiI6ICIiLAogICAgInN1Ym5ldE5hbWUiOiAiIiwKICAgICJzZWN1cml0eUdyb3VwTmFtZSI6ICIiLAogICAgInZuZXROYW1lIjogIiIsCiAgICAiY2xvdWRQcm92aWRlckJhY2tvZmYiOiBmYWxzZSwKICAgICJ1c2VNYW5hZ2VkSWRlbnRpdHlFeHRlbnNpb24iOiBmYWxzZSwKICAgICJ1c2VJbnN0YW5jZU1ldGFkYXRhIjogdHJ1ZQp9Cgo=
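
Alternatively, kubectl can generate an equivalent manifest directly from azure.json, which avoids copying the base64 blob around by hand:

$ kubectl create secret generic azure-cloud-provider \
    --namespace kube-system \
    --from-file=cloud-config=azure.json \
    --dry-run=client -o yaml > secret.yaml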

Next, create secretrbac.yaml with the following contents:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/cluster-service: "true"
  name: system:azure-cloud-provider-secret-getter
rules:
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["azure-cloud-provider"]
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    kubernetes.io/cluster-service: "true"
  name: system:azure-cloud-provider-secret-getter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:azure-cloud-provider-secret-getter
subjects:
- kind: ServiceAccount
  name: azure-cloud-provider
  namespace: kube-system

Then we can create these resources:

$ kubectl apply -f secret.yaml
secret/azure-cloud-provider created
$ kubectl apply -f secretrbac.yaml
clusterrole.rbac.authorization.k8s.io/system:azure-cloud-provider-secret-getter created
clusterrolebinding.rbac.authorization.k8s.io/system:azure-cloud-provider-secret-getter created
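
It's worth a quick check that everything landed where expected:

$ kubectl -n kube-system get secret azure-cloud-provider
$ kubectl get clusterrole,clusterrolebinding system:azure-cloud-provider-secret-getter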

At this point we're ready to deploy the Cloud Controller Manager.

Azure Cloud Controller Manager

Create a file called cloud-controller-manager.yaml and populate it with the following:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloud-controller-manager
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:cloud-controller-manager
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    k8s-app: cloud-controller-manager
rules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
      - update
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - "*"
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - services/status
    verbs:
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - serviceaccounts
    verbs:
      - create
      - get
      - list
      - watch
      - update
  - apiGroups:
      - ""
    resources:
      - persistentvolumes
    verbs:
      - get
      - list
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - endpoints
    verbs:
      - create
      - get
      - list
      - watch
      - update
  - apiGroups:
      - ""
    resources:
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - get
      - create
      - update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:cloud-controller-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:cloud-controller-manager
subjects:
  - kind: ServiceAccount
    name: cloud-controller-manager
    namespace: kube-system
  - kind: User
    name: cloud-controller-manager
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: system:cloud-controller-manager:extension-apiserver-authentication-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
  - kind: ServiceAccount
    name: cloud-controller-manager
    namespace: kube-system
  - apiGroup: ""
    kind: User
    name: cloud-controller-manager
---
apiVersion: v1
kind: Pod
metadata:
  name: cloud-controller-manager
  namespace: kube-system
  labels:
    tier: control-plane
    component: cloud-controller-manager
spec:
  priorityClassName: system-node-critical
  hostNetwork: true
  serviceAccountName: cloud-controller-manager
  tolerations:
    - key: node.cloudprovider.kubernetes.io/uninitialized
      value: "true"
      effect: NoSchedule
    - key: node-role.kubernetes.io/controlplane
      value: "true"
      effect: NoSchedule
    - key: node-role.kubernetes.io/etcd
      value: "true"
      effect: NoExecute
  nodeSelector:
    node-role.kubernetes.io/controlplane: "true"
  containers:
    - name: cloud-controller-manager
      image: mcr.microsoft.com/oss/kubernetes/azure-cloud-controller-manager:v0.5.1
      imagePullPolicy: IfNotPresent
      command: ["cloud-controller-manager"]
      args:
        - "--allocate-node-cidrs=true"
        - "--cloud-config=/etc/kubernetes/cloud-config"
        - "--cloud-provider=azure"
        - "--cluster-cidr=10.42.0.0/16"
        - "--cluster-name=azurenetes"
        - "--controllers=*,-cloud-node" # disable cloud-node controller
        - "--configure-cloud-routes=false" # "false" for Azure CNI and "true" for other network plugins
        - "--leader-elect=true"
        - "--route-reconciliation-period=10s"
        - "--v=2"
        - "--port=10267"
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: "4"
          memory: 2Gi
      livenessProbe:
        httpGet:
          path: /healthz
          port: 10267
        initialDelaySeconds: 20
        periodSeconds: 10
        timeoutSeconds: 5
      volumeMounts:
        - name: etc-kubernetes
          mountPath: /etc/kubernetes
        - name: etc-ssl
          mountPath: /etc/ssl
          readOnly: true
        - name: msi
          mountPath: /var/lib/waagent/ManagedIdentity-Settings
          readOnly: true
  volumes:
    - name: etc-kubernetes
      hostPath:
        path: /etc/kubernetes
    - name: etc-ssl
      hostPath:
        path: /etc/ssl
    - name: msi
      hostPath:
        path: /var/lib/waagent/ManagedIdentity-Settings

Key things to note in this file are the options passed to the cloud-controller-manager container itself; in particular, change the cluster-name value, as well as the cluster-cidr, if you've deviated from the RKE defaults.
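
If you're not sure what values the cluster was built with, they live in the RKE config; a quick way to check, assuming the default layout of cluster.yaml:

$ grep 'cluster_name' cluster.yaml
$ grep -A3 'kube-controller' cluster.yaml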

Apply this file to the cluster:

$ kubectl apply -f cloud-controller-manager.yaml

This will launch the CCM in the kube-system namespace. Keep an eye on the Pod's logs as it starts up:

$ kubectl logs -f cloud-controller-manager -n kube-system

Things to look out for include the flags being picked up correctly, e.g.:

I1126 12:30:10.782231       1 flags.go:33] FLAG: --cluster-name="azurenetes"

And that it's detected the nodes in our cluster:

I1126 12:30:11.972288       1 controller.go:669] Detected change in list of current cluster nodes. New node set: map[control-node-0:{} worker-node-0:{} worker-node-1:{} worker-node-2:{}]
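
It's also worth confirming that the nodes themselves are healthy and schedulable; kubelets started with an external cloud provider register with the node.cloudprovider.kubernetes.io/uninitialized taint, and that needs to have been cleared before ordinary workloads will schedule:

$ kubectl get nodes -o wide
$ kubectl describe nodes | grep -i taints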

Testing the Cloud Controller Manager

If everything's configured correctly, we should be able to create a Service of type LoadBalancer and have the Azure Cloud Controller Manager create the corresponding load balancer in Azure for us automatically. Let's define a simple Service of this type (note it has no selector or backing Pods; it exists purely to exercise load balancer provisioning):

$ kubectl apply -f - << EOF
apiVersion: v1
kind: Service
metadata:
  name: test
  namespace: default
spec:
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
  sessionAffinity: None
  type: LoadBalancer
EOF
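
Azure takes a minute or two to provision the load balancer, during which the Service's EXTERNAL-IP column shows <pending>; you can watch for the address to be assigned:

$ kubectl get svc test --watch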

Keep an eye on the logs for the cloud-controller-manager Pod, and all being well you should see it creating our Azure load balancer:

I1126 12:45:51.013505       1 controller.go:347] Ensuring load balancer for service default/test
I1126 12:45:51.013531       1 controller.go:814] Adding finalizer to service default/test
I1126 12:45:51.014515       1 event.go:278] Event(v1.ObjectReference{Kind:"Service", Namespace:"default", Name:"test", UID:"6b29edb2-7df4-4ef3-a7ad-9227bbf7016e", APIVersion:"v1", ResourceVersion:"5186", FieldPath:""}): type: 'Normal' reason: 'EnsuringLoadBalancer' Ensuring load balancer

[..]

I1126 12:45:56.654927       1 event.go:278] Event(v1.ObjectReference{Kind:"Service", Namespace:"default", Name:"test", UID:"6b29edb2-7df4-4ef3-a7ad-9227bbf7016e", APIVersion:"v1", ResourceVersion:"5186", FieldPath:""}): type: 'Normal' reason: 'EnsuredLoadBalancer' Ensured load balancer

Examining this Service in Kubernetes should show that it's been created successfully; in my example it has a public IP address associated:

$ kubectl get svc test
NAME   TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)          AGE
test   LoadBalancer   10.43.68.51   51.104.184.237   8080:31631/TCP   101s

Finally, we can test deleting this resource:

$ kubectl delete svc test
service "test" deleted

Over in the cloud-controller-manager logs, the corresponding teardown is visible:

I1126 13:20:43.919629       1 azure_loadbalancer.go:676] reconcileLoadBalancer for service(default/test) - wantLb(false): started

[..]

I1126 13:21:24.517016       1 event.go:278] Event(v1.ObjectReference{Kind:"Service", Namespace:"default", Name:"test", UID:"6b29edb2-7df4-4ef3-a7ad-9227bbf7016e", APIVersion:"v1", ResourceVersion:"12723", FieldPath:""}): type: 'Normal' reason: 'DeletedLoadBalancer' Deleted load balancer
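
To double-check on the Azure side, confirm that the load balancer has been cleaned up in your resource group (assuming the az CLI and the same resourceGroup value as in azure.json):

$ az network lb list --resource-group <resourceGroup> --output table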