RKE reference for in-tree provider configuration
A cluster provisioned with RKE must include the following section in cluster.yaml
to specify that it uses an external Cloud Provider:
cloud_provider:
  name: external
  customCloudProvider: |-
    cloudConfigType: "merge"
    loadBalancerSku: "standard"
Additionally, every entry in the list of nodes must have the hostname_override option set, for example:
nodes:
  - address: 20.54.122.134
    internal_address: 10.1.10.4
    hostname_override: control-node-0
    user: azureuser
    role:
      - controlplane
      - etcd
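A missing hostname_override is easy to overlook in a long node list. The following is a rough sketch of a sanity check, not an RKE feature: it assumes every node entry starts with "- address:" as in the example above, and the function name check_hostname_overrides is made up for illustration.

```shell
# Heuristic check: compare the number of node entries with the number of
# hostname_override keys in an RKE cluster file.
# Assumption: every node entry begins with "- address:".
check_hostname_overrides() {
  file="$1"
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  nodes=$(grep -c '^[[:space:]]*-[[:space:]]*address:' "$file" || true)
  overrides=$(grep -c '^[[:space:]]*hostname_override:' "$file" || true)
  if [ "$nodes" -gt 0 ] && [ "$nodes" -eq "$overrides" ]; then
    echo "ok: $nodes nodes, all with hostname_override"
  else
    echo "mismatch: $nodes nodes, $overrides hostname_override entries"
    return 1
  fi
}
```

Calling check_hostname_overrides cluster.yaml before running RKE would flag a node that lacks the option. Because it only counts lines, it cannot tell which node is affected.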
There are three things we need to do to get the external Azure Cloud Provider working:
- Define a secret with the necessary credentials for the Cloud Controller Manager to be able to talk to the Azure API;
- Configure RBAC for access to this secret;
- Deploy the Cloud Controller Manager.
Create a file called azure.json and populate it with the following:
{
    "tenantId": "",
    "aadClientId": "",
    "aadClientSecret": "",
    "subscriptionId": "",
    "resourceGroup": "",
    "location": "",
    "subnetName": "",
    "securityGroupName": "",
    "vnetName": "",
    "cloudProviderBackoff": false,
    "useManagedIdentityExtension": false,
    "useInstanceMetadata": true
}
Update the values for each of the empty keys. We then need to base64-encode this file for inclusion in our Secret resource definition:
$ base64 azure.json
ewogICAgInRlbmFudElkIjogIiIsCiAgICAiYWFkQ2xpZW50SWQiOiAiIiwKICAgICJhYWRDbGllbnRTZWNyZXQiOiAiIiwKICAgICJzdWJzY3JpcHRpb25JZCI6ICIiLAogICAgInJlc291cmNlR3JvdXAiOiAiIiwKICAgICJsb2NhdGlvbiI6ICIiLAogICAgInN1Ym5ldE5hbWUiOiAiIiwKICAgICJzZWN1cml0eUdyb3VwTmFtZSI6ICIiLAogICAgInZuZXROYW1lIjogIiIsCiAgICAiY2xvdWRQcm92aWRlckJhY2tvZmYiOiBmYWxzZSwKICAgICJ1c2VNYW5hZ2VkSWRlbnRpdHlFeHRlbnNpb24iOiBmYWxzZSwKICAgICJ1c2VJbnN0YW5jZU1ldGFkYXRhIjogdHJ1ZQp9Cgo=
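Two things can go wrong at this step: encoding the file while some values are still empty, and getting line-wrapped output from the local base64 implementation. The sketch below guards against both. The function name encode_cloud_config is hypothetical, and the empty-value test is a plain text match on `"key": ""`, not a real JSON parser.

```shell
# Encode a cloud config file as single-line base64, refusing to do so while
# any key still has an empty string value.
encode_cloud_config() {
  file="$1"
  if grep -q '":[[:space:]]*""' "$file"; then
    echo "error: $file still contains empty values" >&2
    return 1
  fi
  # Strip newlines so the result is one line regardless of whether the
  # installed base64 wraps its output.
  base64 < "$file" | tr -d '\n'
  echo
}
```

Running encode_cloud_config azure.json prints the string to paste into the Secret, or an error if a value was left blank.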
Use this to create secret.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: azure-cloud-provider
  namespace: kube-system
type: Opaque
data:
  cloud-config: |
    ewogICAgInRlbmFudElkIjogIiIsCiAgICAiYWFkQ2xpZW50SWQiOiAiIiwKICAgICJhYWRDbGllbnRTZWNyZXQiOiAiIiwKICAgICJzdWJzY3JpcHRpb25JZCI6ICIiLAogICAgInJlc291cmNlR3JvdXAiOiAiIiwKICAgICJsb2NhdGlvbiI6ICIiLAogICAgInN1Ym5ldE5hbWUiOiAiIiwKICAgICJzZWN1cml0eUdyb3VwTmFtZSI6ICIiLAogICAgInZuZXROYW1lIjogIiIsCiAgICAiY2xvdWRQcm92aWRlckJhY2tvZmYiOiBmYWxzZSwKICAgICJ1c2VNYW5hZ2VkSWRlbnRpdHlFeHRlbnNpb24iOiBmYWxzZSwKICAgICJ1c2VJbnN0YW5jZU1ldGFkYXRhIjogdHJ1ZQp9Cgo=
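Copying a long base64 string by hand invites mistakes, so the manifest can also be generated straight from azure.json. This is a minimal sketch: the function name render_secret is an assumption, while the Secret name, namespace and cloud-config key are taken from the manifest above.

```shell
# Print a Secret manifest whose cloud-config value is the base64 encoding of
# the given file.
render_secret() {
  encoded=$(base64 < "$1" | tr -d '\n')
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: azure-cloud-provider
  namespace: kube-system
type: Opaque
data:
  cloud-config: $encoded
EOF
}
```

For example, render_secret azure.json > secret.yaml writes the same kind of file as above, with the encoded value on a single line.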
Also create secretrbac.yaml with the following contents:
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/cluster-service: "true"
  name: system:azure-cloud-provider-secret-getter
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["azure-cloud-provider"]
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  labels:
    kubernetes.io/cluster-service: "true"
  name: system:azure-cloud-provider-secret-getter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:azure-cloud-provider-secret-getter
subjects:
  - kind: ServiceAccount
    name: azure-cloud-provider
    namespace: kube-system
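Then we can create these resources. The deprecation warnings shown below come from the v1beta1 RBAC API version used in secretrbac.yaml; that version is unavailable on Kubernetes v1.22 and later, where rbac.authorization.k8s.io/v1 must be used instead: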
$ kubectl apply -f secret.yaml
secret/azure-cloud-provider created
$ kubectl apply -f secretrbac.yaml
Warning: rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole
clusterrole.rbac.authorization.k8s.io/system:azure-cloud-provider-secret-getter created
Warning: rbac.authorization.k8s.io/v1beta1 ClusterRoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRoleBinding
clusterrolebinding.rbac.authorization.k8s.io/system:azure-cloud-provider-secret-getter created
At this point we're ready to deploy the Cloud Controller Manager.
Create a file called cloud-controller-manager.yaml and populate it with the following:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloud-controller-manager
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:cloud-controller-manager
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    k8s-app: cloud-controller-manager
rules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
      - update
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - "*"
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - services/status
    verbs:
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - serviceaccounts
    verbs:
      - create
      - get
      - list
      - watch
      - update
  - apiGroups:
      - ""
    resources:
      - persistentvolumes
    verbs:
      - get
      - list
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - endpoints
    verbs:
      - create
      - get
      - list
      - watch
      - update
  - apiGroups:
      - ""
    resources:
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - get
      - create
      - update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:cloud-controller-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:cloud-controller-manager
subjects:
  - kind: ServiceAccount
    name: cloud-controller-manager
    namespace: kube-system
  - kind: User
    name: cloud-controller-manager
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: system:cloud-controller-manager:extension-apiserver-authentication-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
  - kind: ServiceAccount
    name: cloud-controller-manager
    namespace: kube-system
  - apiGroup: ""
    kind: User
    name: cloud-controller-manager
---
apiVersion: v1
kind: Pod
metadata:
  name: cloud-controller-manager
  namespace: kube-system
  labels:
    tier: control-plane
    component: cloud-controller-manager
spec:
  priorityClassName: system-node-critical
  hostNetwork: true
  serviceAccountName: cloud-controller-manager
  tolerations:
    - key: node.cloudprovider.kubernetes.io/uninitialized
      value: "true"
      effect: NoSchedule
    - key: node-role.kubernetes.io/controlplane
      value: "true"
      effect: NoSchedule
    - key: node-role.kubernetes.io/etcd
      value: "true"
      effect: NoExecute
  nodeSelector:
    node-role.kubernetes.io/controlplane: "true"
  containers:
    - name: cloud-controller-manager
      image: mcr.microsoft.com/oss/kubernetes/azure-cloud-controller-manager:v0.5.1
      imagePullPolicy: IfNotPresent
      command: ["cloud-controller-manager"]
      args:
        - "--allocate-node-cidrs=true"
        - "--cloud-config=/etc/kubernetes/cloud-config"
        - "--cloud-provider=azure"
        - "--cluster-cidr=10.42.0.0/16"
        - "--cluster-name=azurenetes"
        - "--controllers=*,-cloud-node" # disable cloud-node controller
        - "--configure-cloud-routes=false" # "false" for Azure CNI and "true" for other network plugins
        - "--leader-elect=true"
        - "--route-reconciliation-period=10s"
        - "--v=2"
        - "--port=10267"
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: "4"
          memory: 2Gi
      livenessProbe:
        httpGet:
          path: /healthz
          port: 10267
        initialDelaySeconds: 20
        periodSeconds: 10
        timeoutSeconds: 5
      volumeMounts:
        - name: etc-kubernetes
          mountPath: /etc/kubernetes
        - name: etc-ssl
          mountPath: /etc/ssl
          readOnly: true
        - name: msi
          mountPath: /var/lib/waagent/ManagedIdentity-Settings
          readOnly: true
  volumes:
    - name: etc-kubernetes
      hostPath:
        path: /etc/kubernetes
    - name: etc-ssl
      hostPath:
        path: /etc/ssl
    - name: msi
      hostPath:
        path: /var/lib/waagent/ManagedIdentity-Settings
Key things to note in this file are the options being passed to the cloud-controller-manager container itself, and that you should change the cluster-name value, as well as the cluster-cidr if you've deviated from the RKE defaults.
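Rather than editing the two flags by hand, they can be substituted with sed. This is only a sketch: set_ccm_flags is a made-up helper name, and it assumes the flags appear in the manifest as quoted "--cluster-name=..." and "--cluster-cidr=..." arguments, as they do above.

```shell
# Print the manifest with the cluster name and pod CIDR flags replaced.
# "|" is used as the sed delimiter because a CIDR contains "/"; the values
# themselves must therefore not contain "|" or "&".
set_ccm_flags() {
  manifest="$1"
  cluster_name="$2"
  cluster_cidr="$3"
  sed -e "s|--cluster-name=[^\"]*|--cluster-name=$cluster_name|" \
      -e "s|--cluster-cidr=[^\"]*|--cluster-cidr=$cluster_cidr|" \
      "$manifest"
}
```

For example, set_ccm_flags cloud-controller-manager.yaml my-cluster 10.244.0.0/16 > ccm.yaml writes a customised copy and leaves the original untouched. The cluster name and CIDR in that command are placeholders.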
Apply this file to the cluster:
$ kubectl apply -f cloud-controller-manager.yaml
This will launch CCM in the kube-system namespace. Keep an eye on this Pod's logs as it starts up:
$ kubectl logs -f cloud-controller-manager -n kube-system
Things to look out for are the right flags being set, e.g.:
I1126 12:30:10.782231 1 flags.go:33] FLAG: --cluster-name="azurenetes"
And that it's detected the nodes in our cluster:
I1126 12:30:11.972288 1 controller.go:669] Detected change in list of current cluster nodes. New node set: map[control-node-0:{} worker-node-0:{} worker-node-1:{} worker-node-2:{}]
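To compare the detected nodes against the hostname_override values without reading the log by eye, the node names can be pulled out of that line. This is a sketch that assumes the log format shown above; the function name nodes_from_log is hypothetical.

```shell
# Read CCM log lines on stdin and print the node names from the most recent
# "New node set: map[...]" entry, one per line.
nodes_from_log() {
  grep -o 'New node set: map\[[^]]*]' |
    tail -n 1 |
    sed -e 's/^New node set: map\[//' -e 's/]$//' |
    tr ' ' '\n' |
    sed 's/:{}$//'
}
```

It can be fed from the command above, for example kubectl logs cloud-controller-manager -n kube-system | nodes_from_log.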
If everything's configured correctly, we should be able to create a Service of type LoadBalancer and the Azure Cloud Controller Manager should handle creating that object in Azure for us automatically. Let's define a simple Service of this type:
$ kubectl apply -f - << EOF
apiVersion: v1
kind: Service
metadata:
  name: test
  namespace: default
spec:
  ports:
    - port: 8080
      protocol: TCP
      targetPort: 8080
  sessionAffinity: None
  type: LoadBalancer
EOF
Keep an eye on the logs for the cloud-controller-manager Pod and, all being well, you should see it creating our Azure LoadBalancer:
I1126 12:45:51.013505 1 controller.go:347] Ensuring load balancer for service default/test
I1126 12:45:51.013531 1 controller.go:814] Adding finalizer to service default/test
I1126 12:45:51.014515 1 event.go:278] Event(v1.ObjectReference{Kind:"Service", Namespace:"default", Name:"test", UID:"6b29edb2-7df4-4ef3-a7ad-9227bbf7016e", APIVersion:"v1", ResourceVersion:"5186", FieldPath:""}): type: 'Normal' reason: 'EnsuringLoadBalancer' Ensuring load balancer
[..]
I1126 12:45:56.654927 1 event.go:278] Event(v1.ObjectReference{Kind:"Service", Namespace:"default", Name:"test", UID:"6b29edb2-7df4-4ef3-a7ad-9227bbf7016e", APIVersion:"v1", ResourceVersion:"5186", FieldPath:""}): type: 'Normal' reason: 'EnsuredLoadBalancer' Ensured load balancer
Examining this Service in Kubernetes should show that it's been created successfully, and in my example has a public IP address associated:
$ kubectl get svc test
NAME   TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)          AGE
test   LoadBalancer   10.43.68.51   51.104.184.237   8080:31631/TCP   101s
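The external IP is not assigned instantly; until Azure finishes provisioning, the EXTERNAL-IP column shows <pending>. A small polling loop can wait for it, sketched below. The function name wait_for_lb_ip and the LB_POLL_INTERVAL variable are assumptions for illustration, and the loop relies only on kubectl's jsonpath output.

```shell
# Poll a Service until its load balancer reports an IP, then print it.
# Arguments: service name, namespace (default "default"), attempts (default 30).
wait_for_lb_ip() {
  svc="$1"
  ns="${2:-default}"
  tries="${3:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    ip=$(kubectl get svc "$svc" -n "$ns" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null || true)
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    i=$((i + 1))
    # Seconds between attempts; override with LB_POLL_INTERVAL.
    sleep "${LB_POLL_INTERVAL:-10}"
  done
  echo "timed out waiting for an external IP on $ns/$svc" >&2
  return 1
}
```

Running wait_for_lb_ip test default prints the address once it is available, or fails after the given number of attempts.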
Finally, we can test deleting this resource and watch the cloud-controller-manager logs to confirm that the Azure load balancer is removed:
$ kubectl delete svc test
service "test" deleted
I1126 13:20:43.919629 1 azure_loadbalancer.go:676] reconcileLoadBalancer for service(default/test) - wantLb(false): started
[..]
I1126 13:21:24.517016 1 event.go:278] Event(v1.ObjectReference{Kind:"Service", Namespace:"default", Name:"test", UID:"6b29edb2-7df4-4ef3-a7ad-9227bbf7016e", APIVersion:"v1", ResourceVersion:"12723", FieldPath:""}): type: 'Normal' reason: 'DeletedLoadBalancer' Deleted load balancer