This PoC covers three topics:
- ResourceClaim Status for Networking
- Container Runtime as DRA Driver
- Network DRA Driver as Re-Usable Framework/Pattern
Repositories:
- https://github.com/LionelJouin/multi-network/tree/framework
- https://github.com/LionelJouin/kubernetes/tree/dra-device-status
- https://github.com/LionelJouin/containerd/tree/dra-cni
Summary:
1. Kubelet calls NodePrepareResources on the DRA Driver with the list of claim names/UIDs to prepare.
2. The DRA Driver retrieves the claims from the Kubernetes API, prepares the devices (stores them so they can be used when CNI ADD is called), and returns them from the NodePrepareResources call.
3. Kubelet calls RunPodSandbox on the Container Runtime to create the Pod.
4. During RunPodSandbox, the claims stored in step 2 for the pod being handled are retrieved, and the CNI plugins are called based on the information contained in the claims.
5. The status is set and updated via the Kubernetes API, then the RunPodSandbox call completes.
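The hand-off between NodePrepareResources and RunPodSandbox described above can be sketched as a small in-memory store. This is a minimal illustration, not the code from the containerd fork: the `PreparedClaim` and `ClaimStore` names, and the choice of the pod UID as the lookup key, are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// PreparedClaim is a hypothetical, trimmed-down view of a ResourceClaim:
// only what is needed to call CNI ADD later.
type PreparedClaim struct {
	Name      string // ResourceClaim name
	Interface string // interface name taken from the opaque parameters
	CNIConfig string // CNI configuration (JSON) taken from the opaque parameters
}

// ClaimStore keeps prepared claims between NodePrepareResources and
// RunPodSandbox. Keying by pod UID is an assumption of this sketch.
type ClaimStore struct {
	mu     sync.RWMutex
	claims map[string][]PreparedClaim
}

func NewClaimStore() *ClaimStore {
	return &ClaimStore{claims: make(map[string][]PreparedClaim)}
}

// Prepare records a claim for a pod. NodePrepareResources would call this
// after fetching the claim from the Kubernetes API.
func (s *ClaimStore) Prepare(podUID string, c PreparedClaim) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.claims[podUID] = append(s.claims[podUID], c)
}

// ForPod returns a copy of the claims prepared for a pod. RunPodSandbox
// would call this before invoking the CNI plugins.
func (s *ClaimStore) ForPod(podUID string) []PreparedClaim {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return append([]PreparedClaim(nil), s.claims[podUID]...)
}

func main() {
	store := NewClaimStore()
	podUID := "2bd46adf-b478-4e25-9e37-828539799169"

	// Steps 1 and 2: the claim is prepared and stored.
	store.Prepare(podUID, PreparedClaim{
		Name:      "macvlan-eth0-attachment",
		Interface: "net1",
		CNIConfig: `{"cniVersion":"1.0.0","name":"macvlan-eth0"}`,
	})

	// Step 4: the claims are looked up while the sandbox is created.
	for _, c := range store.ForPod(podUID) {
		fmt.Printf("would call CNI ADD for claim %s on interface %s\n", c.Name, c.Interface)
	}
}
```

A real driver would also need to drop the entries again when the claims are unprepared, which is omitted here.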
The ResourceClaimStatus has been extended to contain a new field:
- AllocatedDeviceStatus: A field containing the status of an allocated device. This contains two ways to report actual data of the device:
  - DeviceInfo: A field accepting any kind of data like the opaque parameters (.spec.devices.config.opaque.parameters).
  - NetworkDeviceInfo: A field only for the network devices.
// ResourceClaimStatus tracks whether the resource has been allocated and what
// the result of that was.
type ResourceClaimStatus struct {
	...

	// DeviceStatuses contains the status of each device allocated for this
	// claim, as reported by the driver. This can include driver-specific
	// information. Entries are owned by their respective drivers.
	//
	// +optional
	// +listType=map
	// +listMapKey=devicePoolName
	// +listMapKey=deviceName
	DeviceStatuses []AllocatedDeviceStatus `json:"deviceStatuses,omitempty" protobuf:"bytes,4,opt,name=deviceStatuses"`
}

// AllocatedDeviceStatus contains the status of an allocated device, if the
// driver chooses to report it. This may include driver-specific information.
type AllocatedDeviceStatus struct {
	// Request is the name of the request in the claim which caused this
	// device to be allocated. Multiple devices may have been allocated
	// per request.
	//
	// +required
	Request string `json:"request" protobuf:"bytes,1,rep,name=request"`

	// Driver specifies the name of the DRA driver whose kubelet
	// plugin should be invoked to process the allocation once the claim is
	// needed on a node.
	//
	// Must be a DNS subdomain and should end with a DNS domain owned by the
	// vendor of the driver.
	//
	// +required
	Driver string `json:"driver" protobuf:"bytes,2,rep,name=driver"`

	// This name together with the driver name and the device name field
	// identify which device was allocated (`<driver name>/<pool name>/<device name>`).
	//
	// Must not be longer than 253 characters and may contain one or more
	// DNS sub-domains separated by slashes.
	//
	// +required
	Pool string `json:"pool" protobuf:"bytes,3,rep,name=pool"`

	// Device references one device instance via its name in the driver's
	// resource pool. It must be a DNS label.
	//
	// +required
	Device string `json:"device" protobuf:"bytes,4,rep,name=device"`

	// Conditions contains the latest observation of the device's state.
	// If the device has been configured according to the class and claim
	// config references, the `Ready` condition should be True.
	//
	// +optional
	// +listType=atomic
	Conditions []metav1.Condition `json:"conditions" protobuf:"bytes,5,rep,name=conditions"`

	// DeviceInfo contains Arbitrary driver-specific data.
	//
	// +optional
	DeviceInfo runtime.RawExtension `json:"deviceInfo,omitempty" protobuf:"bytes,6,rep,name=deviceInfo"`

	// NetworkDeviceInfo contains network-related information specific to the device.
	//
	// +optional
	NetworkDeviceInfo NetworkDeviceInfo `json:"networkDeviceInfo,omitempty" protobuf:"bytes,7,rep,name=networkDeviceInfo"`
}

// NetworkDeviceInfo provides network-related details for the allocated device.
// This information may be filled by drivers or other components to configure
// or identify the device within a network context.
type NetworkDeviceInfo struct {
	// Interface specifies the name of the network interface associated with
	// the allocated device. This might be the name of a physical or virtual
	// network interface.
	//
	// +optional
	Interface string `json:"interface,omitempty" protobuf:"bytes,1,rep,name=interface"`

	// IPs lists the IP addresses assigned to the device's network interface.
	// This can include both IPv4 and IPv6 addresses.
	//
	// +optional
	IPs []string `json:"ips,omitempty" protobuf:"bytes,2,rep,name=ips"`

	// Mac represents the MAC address of the device's network interface.
	//
	// +optional
	Mac string `json:"mac,omitempty" protobuf:"bytes,3,rep,name=mac"`
}
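As an illustration of how NetworkDeviceInfo could be filled, the sketch below derives it from a CNI ADD result (the same JSON that ends up in DeviceInfo in this PoC). It is a minimal example under assumptions: `networkDeviceInfoFromCNI` and `cniResult` are hypothetical names, the local `NetworkDeviceInfo` copy drops the protobuf tags, and only the CNI result fields needed here are decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NetworkDeviceInfo mirrors the API struct above (protobuf tags omitted).
type NetworkDeviceInfo struct {
	Interface string   `json:"interface,omitempty"`
	IPs       []string `json:"ips,omitempty"`
	Mac       string   `json:"mac,omitempty"`
}

// cniResult is a partial, local decoding of a CNI 1.0.0 result.
type cniResult struct {
	Interfaces []struct {
		Name    string `json:"name"`
		Mac     string `json:"mac"`
		Sandbox string `json:"sandbox"`
	} `json:"interfaces"`
	IPs []struct {
		Address   string `json:"address"`
		Interface *int   `json:"interface"` // index into Interfaces
	} `json:"ips"`
}

// networkDeviceInfoFromCNI (hypothetical helper) extracts the interface name,
// MAC address, and IPs of the pod-side interface called ifName.
func networkDeviceInfoFromCNI(raw []byte, ifName string) (NetworkDeviceInfo, error) {
	var res cniResult
	if err := json.Unmarshal(raw, &res); err != nil {
		return NetworkDeviceInfo{}, fmt.Errorf("decode CNI result: %w", err)
	}
	idx := -1
	for i, itf := range res.Interfaces {
		// Only interfaces with a sandbox live in the pod network namespace.
		if itf.Name == ifName && itf.Sandbox != "" {
			idx = i
			break
		}
	}
	if idx < 0 {
		return NetworkDeviceInfo{}, fmt.Errorf("interface %q not found in CNI result", ifName)
	}
	info := NetworkDeviceInfo{Interface: ifName, Mac: res.Interfaces[idx].Mac}
	for _, ip := range res.IPs {
		if ip.Interface != nil && *ip.Interface == idx {
			info.IPs = append(info.IPs, ip.Address)
		}
	}
	return info, nil
}

func main() {
	raw := []byte(`{
	  "cniVersion": "1.0.0",
	  "interfaces": [{"name": "net1", "mac": "1e:32:6c:b7:c9:66",
	    "sandbox": "/var/run/netns/cni-5b7c0846-7995-9450-f441-a177399d08d5"}],
	  "ips": [{"address": "10.10.1.2/24", "gateway": "10.10.1.1", "interface": 0}]
	}`)
	info, err := networkDeviceInfoFromCNI(raw, "net1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", info)
}
```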
Here is an example of the final ResourceClaim for the demo shown in this PoC:
apiVersion: resource.k8s.io/v1alpha3
kind: ResourceClaim
metadata:
  name: macvlan-eth0-attachment
spec:
  devices:
    config:
    - opaque:
        driver: poc.dra.networking
        parameters:
          config: '{ "cniVersion": "1.0.0", "name": "macvlan-eth0", "plugins": [ {
            "type": "macvlan", "master": "eth0", "mode": "bridge", "ipam": { "type":
            "host-local", "ranges": [ [ { "subnet": "10.10.1.0/24" } ] ] } } ] }'
          interface: net1
      requests:
      - macvlan-eth0
    requests:
    - allocationMode: ExactCount
      count: 1
      deviceClassName: cni-v1
      name: macvlan-eth0
status:
  allocation:
    devices:
      config:
      - opaque:
          driver: poc.dra.networking
          parameters:
            config: '{ "cniVersion": "1.0.0", "name": "macvlan-eth0", "plugins": [
              { "type": "macvlan", "master": "eth0", "mode": "bridge", "ipam": { "type":
              "host-local", "ranges": [ [ { "subnet": "10.10.1.0/24" } ] ] } } ] }'
            interface: net1
        requests:
        - macvlan-eth0
        source: FromClaim
      results:
      - device: cni
        driver: poc.dra.networking
        pool: kind-worker
        request: macvlan-eth0
    nodeSelector:
      nodeSelectorTerms:
      - matchFields:
        - key: metadata.name
          operator: In
          values:
          - kind-worker
  deviceStatuses:
  - conditions: null
    device: cni
    deviceInfo:
      cniVersion: 1.0.0
      interfaces:
      - mac: 1e:32:6c:b7:c9:66
        name: net1
        sandbox: /var/run/netns/cni-5b7c0846-7995-9450-f441-a177399d08d5
      ips:
      - address: 10.10.1.2/24
        gateway: 10.10.1.1
        interface: 0
    driver: poc.dra.networking
    networkDeviceInfo:
      interface: net1
      ips:
      - 10.10.1.2/24
      mac: 1e:32:6c:b7:c9:66
    pool: kind-worker
    request: macvlan-eth0
  reservedFor:
  - name: demo-a
    resource: pods
    uid: 2bd46adf-b478-4e25-9e37-828539799169
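The opaque parameters in the claim above carry two values: the interface name and the CNI configuration as a JSON string. A driver has to decode both before calling CNI ADD. The following is a minimal sketch of that decoding, assuming the parameters arrive as a JSON object; `Parameters`, `ConfList`, and `parseParameters` are hypothetical names and the validation rules are illustrative only.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Parameters mirrors the opaque parameters used in this PoC's ResourceClaim.
type Parameters struct {
	Interface string `json:"interface"`
	Config    string `json:"config"` // CNI configuration list, JSON encoded as a string
}

// ConfList is a partial decoding of a CNI network configuration list.
type ConfList struct {
	CNIVersion string `json:"cniVersion"`
	Name       string `json:"name"`
	Plugins    []struct {
		Type string `json:"type"`
	} `json:"plugins"`
}

// parseParameters (hypothetical helper) decodes the opaque parameters and the
// embedded CNI configuration, and rejects obviously incomplete input.
func parseParameters(raw []byte) (Parameters, ConfList, error) {
	var p Parameters
	var conf ConfList
	if err := json.Unmarshal(raw, &p); err != nil {
		return p, conf, fmt.Errorf("decode opaque parameters: %w", err)
	}
	if p.Interface == "" {
		return p, conf, errors.New("parameters: interface is empty")
	}
	if err := json.Unmarshal([]byte(p.Config), &conf); err != nil {
		return p, conf, fmt.Errorf("decode CNI config: %w", err)
	}
	if conf.Name == "" || len(conf.Plugins) == 0 {
		return p, conf, errors.New("CNI config: name and at least one plugin are required")
	}
	return p, conf, nil
}

func main() {
	raw := []byte(`{"interface":"net1","config":"{\"cniVersion\":\"1.0.0\",\"name\":\"macvlan-eth0\",\"plugins\":[{\"type\":\"macvlan\",\"master\":\"eth0\",\"mode\":\"bridge\"}]}"}`)
	p, conf, err := parseParameters(raw)
	if err != nil {
		fmt.Println("invalid parameters:", err)
		return
	}
	fmt.Printf("interface=%s network=%s first plugin=%s\n", p.Interface, conf.Name, conf.Plugins[0].Type)
}
```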
The Networking DRA Driver runs inside Containerd, so the NRI plugin required in previous PoCs (LionelJouin/network-dra / aojea/kubernetes-network-driver) is no longer needed. However, Containerd now requires Kubernetes API access in order to get the ResourceClaims (on NodePrepareResources, step 1 in the flow picture) and to update the ResourceClaim status (after CNI ADD, step 5 in the flow picture).
This PoC uses the kubelet kubeconfig to access the API (the status update should be allowed with kubelet access in that case). In Kind, Containerd starts before kubelet, so this PoC keeps retrying to read the kubeconfig from a goroutine. Once the kubeconfig is retrieved, Containerd also registers itself as a DRA plugin (Status could be improved to advertise the availability of the networking DRA Driver?).
When a pod is created, its default primary network will be set up and the other networks will be set up right after.
As highlighted by the aojea/kubernetes-network-driver PoC, a DRA Driver for Networking can be created. NodePrepareResources retrieves the ResourceClaims to be used and stores them, so that when the function that adds the networks is called (on RunPodSandbox), the ResourceClaims are already known and can easily be retrieved to add the networks to the pod and update the status.
Clone Kind
git clone git@github.com:kubernetes-sigs/kind.git
Build Kind base image
make -C images/base quick EXTRA_BUILD_OPT="--build-arg CONTAINERD_CLONE_URL=https://github.com/LionelJouin/containerd --build-arg CONTAINERD_VERSION=dra-cni --no-cache" TAG=dra-cni
Clone the Kubernetes fork
git clone git@github.com:kubernetes/kubernetes.git
cd kubernetes
git remote add LionelJouin git@github.com:LionelJouin/kubernetes.git
git fetch LionelJouin
git checkout LionelJouin/dra-device-status
Build Kind image
kind build node-image . --image kindest/node:dra-cni-status --base-image gcr.io/k8s-staging-kind/base:dra-cni
Kind Cluster config:
---
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  "DynamicResourceAllocation": true
  "DRAControlPlaneController": true
runtimeConfig:
  "resource.k8s.io/v1alpha3": true
kubeadmConfigPatches:
- |
  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  logging:
    verbosity: 10
- |
  kind: ClusterConfiguration
  apiServer:
    extraArgs:
      v: "4"
  scheduler:
    extraArgs:
      v: "4"
  controllerManager:
    extraArgs:
      v: "4"
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri"]
    enable_cdi = true
  [plugins.'io.containerd.grpc.v1.cri'.cni]
    cni_dra = true
nodes:
- role: control-plane
  image: kindest/node:dra-cni-status
- role: worker
  image: kindest/node:dra-cni-status
Install CNI Plugins:
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/master/e2e/templates/cni-install.yml.j2
Apply ResourceSlice:
cat <<EOF | kubectl apply -f -
---
apiVersion: resource.k8s.io/v1alpha3
kind: ResourceSlice
metadata:
  name: kind-worker-poc-dra-networking
spec:
  devices:
  - name: cni
    basic:
      attributes:
        name:
          string: "eth0"
  driver: poc.dra.networking
  nodeName: kind-worker
  pool:
    name: kind-worker
    resourceSliceCount: 1
EOF
Apply DeviceClass:
cat <<EOF | kubectl apply -f -
---
apiVersion: resource.k8s.io/v1alpha3
kind: DeviceClass
metadata:
  name: cni-v1
EOF
Apply ResourceClaim and Pod:
cat <<EOF | kubectl apply -f -
---
apiVersion: resource.k8s.io/v1alpha3
kind: ResourceClaim
metadata:
  name: macvlan-eth0-attachment
spec:
  devices:
    requests:
    - name: macvlan-eth0
      deviceClassName: cni-v1
    config:
    - requests:
      - macvlan-eth0
      opaque:
        driver: poc.dra.networking
        parameters:
          interface: "net1"
          config: '{
            "cniVersion": "1.0.0",
            "name": "macvlan-eth0",
            "plugins": [
              {
                "type": "macvlan",
                "master": "eth0",
                "mode": "bridge",
                "ipam": {
                  "type": "host-local",
                  "ranges": [
                    [
                      {
                        "subnet": "10.10.1.0/24"
                      }
                    ]
                  ]
                }
              }
            ]
          }'
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-a
spec:
  containers:
  - name: alpine
    image: alpine:latest
    imagePullPolicy: IfNotPresent
    command:
    - sleep
    - infinity
  resourceClaims:
  - name: macvlan-eth0-attachment
    resourceClaimName: macvlan-eth0-attachment
EOF
Verify the resource claim status:
kubectl get resourceclaims macvlan-eth0-attachment -o yaml
Verify the pod interfaces:
kubectl exec -it demo-a -- ip a
- Sig-Network Sync: https://docs.google.com/document/d/1_w77-zG_Xj0zYvEMfQZTQ-wPP4kXkpGD8smVtW_qqWM/edit
- MN Sync: https://docs.google.com/document/d/1pe_0aOsI35BEsQJ-FhFH9Z_pWQcU2uqwAnOx2NIx6OY/edit#heading=h.fo1yo94x96wg
- CNI Sync: https://hackmd.io/@squeed/cni-meeting-notes
- DRA Sync: https://docs.google.com/document/d/1qxI87VqGtgN7EAJlqVfxx86HGKEAc2A3SKru8nJHNkQ/edit#heading=h.tgg8gganowxq
- kubernetes/enhancements Issue#4817: kubernetes/enhancements#4817
- PoC LionelJouin/network-dra: https://github.com/LionelJouin/network-dra
- PoC aojea/kubernetes-network-driver: https://github.com/aojea/kubernetes-network-driver
- CNI 2.0 Initial Document: https://docs.google.com/document/d/1Sxe0B3ZiqQBVL4Tn-O-xaD5x4YMoeuHt9GCOoFFqyc4/edit#heading=h.53w1b8oxqel8