Skip to content

Instantly share code, notes, and snippets.

@bergwolf
Last active August 10, 2018 12:23
Show Gist options
  • Save bergwolf/c07d8ad20a42baa068f5dee2d7ec4f9a to your computer and use it in GitHub Desktop.
Save bergwolf/c07d8ad20a42baa068f5dee2d7ec4f9a to your computer and use it in GitHub Desktop.
kata bootup speed analysis
  1. Even if we ignore the time consumed by kata-runtime, docker still uses on average 0.629s (0.5526s, 0.684s, 0.588s, 0.69s) to create a new container. So we can claim that docker/containerd architecture makes docker(runc) always slower than hypercontainer now.

  2. All above tests were done with kata persistent storage put in tmpfs. But it does not increate bootup speed. So kata runtime metadata plugin may not help performance either.

  3. The most time consuming operation is kata-runtime create and it needs further investigation.

  4. It needs further investigation on why vm templating is slower than plain system bootup. See LaunchingVM.md. It seems it costs a lot of time to create a kata-proxy process and want it to be ready (almost 1 second). Maybe builtin proxy and vsock can be better here.

  5. For vm template result, the gap between first create and first state (0.39s, 0.35s) also worth further investigation.

w/ vm template:

kata-runtime create -> 1.7s docker -> 0.39s kata-runtime state 0.009s docker -> 0.012s kata-runtime start 0.013s docker -> 0.01s kata-runtime state 0.004s

sum: 2.137s

[vmfarm@osbuilder]$time docker run -d --runtime kata busybox top 0abf1b20d64918a50c92936d911dcfb356a0860632ae7c161a8e85bb6d1ab89b

real 0m2.776s user 0m0.044s sys 0m0.016s

credits to docker: 2.2776 - 2.137 + 0.39 + 0.012 + 0.01 = 0.5526s


w/ vm template again:

kata-runtime create -> 1.666s docker -> 0.35s kata-runtime state 0.002s docker -> 0.006s kata-runtime start 0.007s docker -> 0.005s kata-runtime state 0.002s

sum: 2.047

[vmfarm@osbuilder]$time docker run -d --runtime kata busybox top 1b8b7c310182e52c1c115e787a5c5d06ec6fe9a2cd90de0b3fe8a763ca66404f

real 0m2.370s user 0m0.048s sys 0m0.012s

credits to docker: 2.370 - 2.047 + 0.35 + 0.006 + 0.005 = 0.684s


w/o vm template:

kata-runtime create -> 1.328s docker -> 0.012s kata-runtime state -> 0.002s docker -> 0.006s kata-runtime start -> 0.0057s docker -> 0.005s kata-runtime state -> 0.002s

sum: 1.136s

[vmfarm@osbuilder]$time docker run -d --runtime kata busybox top 9c5d1a3698904e3a9c716b320fbf29c09e296dcf4d45b63f03ea90faf7bd8f5b

real 0m1.701s user 0m0.032s sys 0m0.028s

credits to docker: 1.701 - 1.136 + 0.012 + 0.006 + 0.005 = 0.588s


w/o vm tempalte again:

kata-runtime create -> 1.308s docker -> 0.014s kata-runtime state -> 0.002s docker -> 0.005s kata-runtime start -> 0.0059s docker -> 0.005s kata-runtime state -> 0.002s

sum: 1.343

[vmfarm@osbuilder]$time docker run -d --runtime kata busybox top d1e992ba3948e178ba624cee388aebdf7396503696b4a6d66cfed403e2715018

real 0m2.009s user 0m0.048s sys 0m0.020s

credits to docker: 2.009 - 1.343 + 0.014 + 0.005 + 0.005 = 0.69s

w/ vm template

from launching qemu to first grpc via the proxy, it costs 1.459s

First connection made from runtime via proxy to agent, costs almost 1 second:

time="2018-08-03T21:40:09.965992008+08:00" level=info msg="New client" arch=amd64 command=create container=1b8b7c310182e52c1c115e787a5c5d06ec6fe9a2cd90de0b3fe8a763ca66404f name=kata-runtime pid=27749 source=virtcontainers subsystem=kata_agent url="unix:///run/vc/sbs/1b8b7c310182e52c1c115e787a5c5d06ec6fe9a2cd90de0b3fe8a763ca66404f/proxy.sock" time="2018-08-03T21:40:10.96781241+08:00" level=debug msg="sending request" arch=amd64 command=create container=1b8b7c310182e52c1c115e787a5c5d06ec6fe9a2cd90de0b3fe8a763ca66404f name=grpc.CheckRequest pid=27749 req= source=virtcontainers subsystem=kata_agent

Between lauching qemu and making the first connection:

  1. resume vm -> 0.043s
  2. reseed rng -> 0.348s

And VM/agent is fully up and running before proxy is spawned.

w/o vm template

from launching qemu to first grpc via the proxy, it costs 1.073s

First connection made from runtime via proxy to agent, costs almost 1 second:

time="2018-08-03T21:32:48.541164012+08:00" level=info msg="New client" arch=amd64 command=create container=d1e992ba3948e178ba624cee388aebdf7396503696b4a6d66cfed403e2715018 name=kata-runtime pid=27073 source=virtcontainers subsystem=kata_agent url="unix:///run/vc/sbs/d1e992ba3948e178ba624cee388aebdf7396503696b4a6d66cfed403e2715018/proxy.sock" time="2018-08-03T21:32:49.542341235+08:00" level=debug msg="sending request" arch=amd64 command=create container=d1e992ba3948e178ba624cee388aebdf7396503696b4a6d66cfed403e2715018 name=grpc.CheckRequest pid=27073 req= source=virtcontainers subsystem=kata_agent

Almost no operation in between.

Reasoning

w/o vm template, it parallelize two things:

  1. wait vm/agent to be fully up and working
  2. spawn the proxy

It seems that it might also cost a lot of time to spawn the proxy and wait it to be ready.

It appears that proxy takes one second to accept the connect request from runtime or vm template:

time="2018-08-10T16:38:11.477261815+08:00" level=info msg="proxy started" arch=amd64 command=create container=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 name=kata-runtime pid=4777 proxy-pid=4823 proxy-url="unix:///run/vc/sbs/f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8/proxy.sock" sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=virtcontainers subsystem=kata_agent
time="2018-08-10T16:38:11.477600864+08:00" level=info msg="New client" arch=amd64 command=create container=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 name=kata-runtime pid=4777 source=virtcontainers subsystem=kata_agent url="unix:///run/vc/sbs/f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8/proxy.sock"
time="2018-08-10T16:38:11.483032743+08:00" level=info msg=announce agent-logs-socket=/run/vc/vm/f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8/console.sock channel-mux-socket=/run/vc/vm/f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8/kata.sock debug=true log-level=debug name=kata-proxy pid=4823 proxy-listen-socket="unix:///run/vc/sbs/f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8/proxy.sock" sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=proxy version=1.2.0-17962180fc61c66066905546bfd5d8933bf73df1-dirty
time="2018-08-10T16:38:11.483725524+08:00" level=info msg="dailing server" channel=/run/vc/vm/f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8/kata.sock name=kata-proxy pid=4823 sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=proxy
time="2018-08-10T16:38:11.483933052+08:00" level=info msg="server connected" channel=/run/vc/vm/f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8/kata.sock name=kata-proxy pid=4823 sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=proxy
time="2018-08-10T16:38:11.48405896+08:00" level=info msg="proxy listens" name=kata-proxy pid=4823 sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=proxy
time="2018-08-10T16:38:11.484356907+08:00" level=info msg="proxy accepts" name=kata-proxy pid=4823 sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=proxy
time="2018-08-10T16:38:11.484678769+08:00" level=info msg="time=\"2018-08-10T08:37:45.397715551Z\" level=debug msg=\"New serial channel event\" events=4 name=kata-agent pid=1 source=agent\n" name=kata-proxy pid=4823 sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=agent
time="2018-08-10T16:38:11.484921855+08:00" level=info msg="time=\"2018-08-10T08:37:45.398086636Z\" level=info msg=\"agent grpc server listens\" name=kata-agent pid=1 source=agent\n" name=kata-proxy pid=4823 sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=agent
time="2018-08-10T16:38:11.485203807+08:00" level=info msg="time=\"2018-08-10T08:37:45.398388845Z\" level=info msg=\"agent grpc server serves\" name=kata-agent pid=1 source=agent\n" name=kata-proxy pid=4823 sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=agent
time="2018-08-10T16:38:12.478959875+08:00" level=info msg="New client connected" arch=amd64 command=create container=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 name=kata-runtime pid=4777 source=virtcontainers subsystem=kata_agent url="unix:///run/vc/sbs/f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8/proxy.sock"
time="2018-08-10T16:38:12.478913189+08:00" level=info msg="proxy opens stream" name=kata-proxy pid=4823 sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=proxy
time="2018-08-10T16:38:12.480260405+08:00" level=info msg="proxy starts copy goroutine" name=kata-proxy pid=4823 sandbox=f82471d252fb9b397167046ee1273484bcbbfc69c59a99a46f0be383e6b624b8 source=proxy

Or 500ms for non-template case:

time="2018-08-10T16:08:25.729481644+08:00" level=info msg="New client" arch=amd64 command=create container=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 name=kata-runtime pid=3708 source=virtcontainers subsystem=kata_agent url="unix:///run/vc/sbs/a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94/proxy.sock"
time="2018-08-10T16:08:25.730813887+08:00" level=info msg=announce agent-logs-socket=/run/vc/vm/a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94/console.sock channel-mux-socket=/run/vc/vm/a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94/kata.sock debug=true log-level=debug name=kata-proxy pid=3757 proxy-listen-socket="unix:///run/vc/sbs/a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94/proxy.sock" sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=proxy version=1.2.0-17962180fc61c66066905546bfd5d8933bf73df1-dirty
time="2018-08-10T16:08:25.73099507+08:00" level=info msg="dailing server" channel=/run/vc/vm/a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94/kata.sock name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=proxy
time="2018-08-10T16:08:25.731057075+08:00" level=info msg="server connected" channel=/run/vc/vm/a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94/kata.sock name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=proxy
time="2018-08-10T16:08:25.731092518+08:00" level=info msg="proxy listens" name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=proxy
time="2018-08-10T16:08:25.731172814+08:00" level=info msg="proxy accepts" name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=proxy
time="2018-08-10T16:08:26.149533973+08:00" level=info msg="time=\"2018-08-10T08:08:26.140198534Z\" level=info msg=\"Enable grpc tracing\" name=kata-agent pid=1 source=agent\n" name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=agent
time="2018-08-10T16:08:26.14980315+08:00" level=info msg="time=\"2018-08-10T08:08:26.140501942Z\" level=info msg=\"agent grpc server starts\" name=kata-agent pid=1 source=agent\n" name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=agent
time="2018-08-10T16:08:26.149984725+08:00" level=info msg="time=\"2018-08-10T08:08:26.140740316Z\" level=info msg=\"agent grpc server waits\" name=kata-agent pid=1 source=agent\n" name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=agent
time="2018-08-10T16:08:26.150133795+08:00" level=info msg="time=\"2018-08-10T08:08:26.140841054Z\" level=debug msg=\"New serial channel event\" events=4 name=kata-agent pid=1 source=agent\n" name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=agent
time="2018-08-10T16:08:26.150290004+08:00" level=info msg="time=\"2018-08-10T08:08:26.14098357Z\" level=info msg=\"agent grpc server listens\" name=kata-agent pid=1 source=agent\n" name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=agent
time="2018-08-10T16:08:26.150432898+08:00" level=info msg="time=\"2018-08-10T08:08:26.141137014Z\" level=info msg=\"agent grpc server serves\" name=kata-agent pid=1 source=agent\n" name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=agent
time="2018-08-10T16:08:26.730364986+08:00" level=info msg="proxy opens stream" name=kata-proxy pid=3757 sandbox=a2d28570fe0467e1f981c00a1ebfc15c879b234cfcd11f830d6ee32efc351f94 source=proxy

Some backoff retry policy from golang grpc dialer?

OK, I found it. It's the base delay of the grpc dialer's default backoff strategy, -- for the very first backoff retry, delay 1 second. Sadly, the grpc specification does not allow retrying more frequent than the default 1 second first failure backoff, and as a result, golang grpc implementation does not allow changing the default backoff strategy.

https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md

With specific parameters of MIN_CONNECT_TIMEOUT = 20 seconds INITIAL_BACKOFF = 1 second MULTIPLIER = 1.6 MAX_BACKOFF = 120 seconds JITTER = 0.2

Implementations with pressing concerns (such as minimizing the number of wakeups on a mobile phone) may wish to use a different algorithm, and in particular different jitter logic.

Alternate implementations must ensure that connection backoffs started at the same time disperse, and must not attempt connections substantially more often than the above algorithm.

With kata-containers/agent#321, kata containers creation w/o vm templating is around 1 second, which is close to docker/runc (0.9 second). However when vm templating is enabled, it becomes 1.6~1.7 second, although there is no delay in proxy connection. Need more investigation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment