@egernst
Last active March 14, 2017 20:08
dpdk-debug

Baseline

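Make the necessary changes in vm.json/hypervisor.args to move from pc-lite to pc. Note that the diff below appears to have been captured in the reverse direction: its "-" lines carry the pc settings and its "+" lines the original pc-lite ones.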

diff --git a/hypervisor.args b/hypervisor.args
-pc,accel=kvm,kernel_irqchip,nvdimm
+pc-lite,accel=kvm,kernel_irqchip,nvdimm
diff --git a/vm.json b/vm.json
-                       "path": "/usr/share/clear-containers/vmlinuz-4.9.4-53.container",
+                       "path": "/usr/share/clear-containers/vmlinux.container",

Make the necessary changes in cc-oci-runtime as a result of the move from pc-lite to pc:

diff --git a/src/hypervisor.c b/src/hypervisor.c
-#define QEMU_FMT_DEVICE "driver=virtio-net-pci,bus=/pci-lite-host/pcie.0,addr=%x,netdev=%s"
+#define QEMU_FMT_DEVICE "driver=virtio-net-pci,bus=pci.0,addr=%x,netdev=%s"

diff --git a/src/networking.c b/src/networking.c
-       return g_strdup_printf("enp0s%d", index + PCI_OFFSET);
+       return g_strdup_printf("ens%d", index + PCI_OFFSET);
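The two cc-oci-runtime changes above go together: the bus path given to qemu and the interface name expected inside the guest both depend on the machine type (presumably because udev derives the name from the PCI path on pc-lite and from the slot number on pc). A minimal Python sketch of that mapping, using only the strings from the diffs; the PCI_OFFSET value and the function names are hypothetical and are not taken from cc-oci-runtime:

```python
# Hypothetical value for illustration; the real constant lives in cc-oci-runtime.
PCI_OFFSET = 2

# Bus path used in QEMU_FMT_DEVICE for each machine type (from the hypervisor.c diff).
QEMU_BUS = {
    "pc-lite": "/pci-lite-host/pcie.0",
    "pc": "pci.0",
}

# Guest interface name pattern for each machine type (from the networking.c diff).
IFACE_FMT = {
    "pc-lite": "enp0s{slot}",
    "pc": "ens{slot}",
}


def guest_iface_name(index, machine_type):
    """Mirror g_strdup_printf("...%d", index + PCI_OFFSET) for the given machine type."""
    return IFACE_FMT[machine_type].format(slot=index + PCI_OFFSET)


def qemu_device_arg(addr, netdev, machine_type):
    """Mirror QEMU_FMT_DEVICE; addr is rendered in hex, as with %x in the C format."""
    return "driver=virtio-net-pci,bus={bus},addr={addr:x},netdev={netdev}".format(
        bus=QEMU_BUS[machine_type], addr=addr, netdev=netdev
    )
```

If the bus path is switched without the matching name change (or the other way round), the runtime would look for an interface name that does not exist in the guest, which is why both edits are needed for the baseline.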

With these changes, we can boot a clear container with networking without issue. This is of course a hack, but it provides a clean baseline on the pc machine type for OVS-DPDK testing.

Current state when enabling DPDK in the runtime:

As a first step, I tested without passing any network information to qemu or the hyperagent (the JSON passed to pod-create is left unmodified, and no network parameters are added to the qemu argument list). In this configuration the container boots without issue, as in the baseline case.

As a second step, add the DPDK parameters to the qemu argument list, either in cc-oci-runtime or by writing them directly into hypervisor.args. After this change the container hangs, and the only error reported comes from dockerd (nothing from hyperstart, the runtime, or the proxy):

level=error msg="containerd: get exit status" error="containerd: process has not exited" id=d3458e742ee30a70f39235770283df06500123640823c21d5dcc060942560d7a pid=init systemPid=5778
level=debug msg="containerd: process exited" id=d3458e742ee30a70f39235770283df06500123640823c21d5dcc060942560d7a pid=init status=255 systemPid=5778
level=info msg="containerd: d3458e742ee30a70f39235770283df06500123640823c21d5dcc060942560d7a:init (pid 5778) has become an orphan, killing it"
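When comparing a failing run against a working baseline, it can help to split these key=value log lines into fields so that ids, pids and exit statuses can be diffed directly. A small Python sketch, assuming the lines follow the quoting style shown above; the function name is illustrative:

```python
import shlex


def parse_logfmt(line):
    """Split a key=value log line into a dict, honouring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore stray tokens that carry no '='
            fields[key] = value
    return fields
```

For the second message above, the parsed fields would include status "255" and systemPid "5778", which are the values to look for in the other components' logs.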

Compared to the working baseline, the dockerd messages above are the only additional ones observed, after which the shutdown/kill path begins.
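These notes do not list the exact DPDK parameters added in the second step. For orientation, a vhost-user port for OVS-DPDK is usually wired up with qemu options along the following lines; the socket path, ids, memory size and hugepage directory are placeholders, and the helper itself is only an illustration, not part of cc-oci-runtime:

```python
def vhost_user_qemu_args(sock_path, mem_size, hugepage_dir="/dev/hugepages",
                         netdev_id="net0", chardev_id="char0"):
    """Build a typical qemu argument list for one vhost-user network device.

    All ids and paths are placeholders. vhost-user needs guest memory that is
    shared with the vswitch process, hence the hugepage-backed memory object.
    """
    return [
        # Unix socket shared with the OVS-DPDK vhost-user port.
        "-chardev", f"socket,id={chardev_id},path={sock_path}",
        # vhost-user backend bound to that socket.
        "-netdev", f"type=vhost-user,id={netdev_id},chardev={chardev_id},vhostforce",
        # Guest-visible virtio NIC attached to the backend.
        "-device", f"virtio-net-pci,netdev={netdev_id}",
        # Guest RAM backed by shared hugepages so the vswitch can map it.
        "-object", f"memory-backend-file,id=mem0,size={mem_size},"
                   f"mem-path={hugepage_dir},share=on",
        "-numa", "node,memdev=mem0",
        "-mem-prealloc",
    ]
```

The memory options may need to be reconciled with whatever memory configuration is already present in hypervisor.args.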
