@soeirosantos
Last active December 17, 2022 12:09
Docker, Linux, containers, containerized processes, namespaces, cgroups etc

The real nature of containers (:

In this short experiment, we are going to verify how containers really work. We'll see that containers are nothing more than Linux processes running on a host machine. These processes are isolated from the host and from each other by Linux namespaces, and their resource usage is constrained by control groups (cgroups), a Linux kernel feature that organizes processes into hierarchical groups whose use of various types of resources can be limited and monitored.
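
Both mechanisms are visible under /proc for any process, containerized or not. The following is a minimal sketch, not part of the original experiment: the function names and the optional proc-root argument (which defaults to /proc) are my own, added so the functions can be pointed at a test directory.

```shell
#!/bin/sh
# Print the namespaces a process belongs to, one "type target" pair per line.
# $1 = PID, $2 = proc root (hypothetical override, defaults to /proc).
list_namespaces() {
    pid=$1
    proc_root=${2:-/proc}
    for link in "$proc_root/$pid/ns"/*; do
        # Each entry is a symlink whose target looks like "pid:[4026531836]".
        [ -L "$link" ] || continue
        printf '%s %s\n' "$(basename "$link")" "$(readlink "$link")"
    done
}

# Print the cgroup membership lines of a process.
# $1 = PID, $2 = proc root (hypothetical override, defaults to /proc).
list_cgroups() {
    cat "${2:-/proc}/$1/cgroup"
}

# Example (inspecting another user's process usually requires root):
#   list_namespaces "$$"
#   list_cgroups "$$"
```

Two processes that print the same target for a namespace type share that namespace. A containerized process will typically show different targets from a host shell for most types.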

Note: To follow along with this experiment, you need to run the commands on a Linux machine. If you are on Mac or Windows, you can use Vagrant to quickly spin up an Ubuntu machine. In this gist I share a Vagrantfile that starts an Ubuntu VM and installs Docker for you.

Let's start by creating a container on the host machine and connecting to its shell:

$ vagrant@node1:~$ docker run -it --rm --name my-container busybox /bin/sh

Now, from inside the container, let's create a file and start a process (suspend it with Ctrl+Z to get the shell back):

$ echo > my_file_xpto

$ sleep 1000
^Z[1]+  Stopped                    sleep 1000

$ ps
PID   USER     TIME  COMMAND
    1 root      0:00 /bin/sh
    7 root      0:00 sleep 1000
    8 root      0:00 ps

In a different terminal window connected to the host machine, check that the process running "inside" the container is visible from the host, but with a different PID. I'm quoting "inside" because, as we will see, there isn't really an inside.

$ vagrant@node1:~$ ps -C sleep
  PID TTY          TIME CMD
15962 pts/0    00:00:00 sleep
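
The two PIDs belong to the same process, seen from two PID namespaces. On kernels 4.1 and later, the NSpid field in /proc/<pid>/status lists a process's PID in each nested PID namespace, outermost first. Here is a small sketch that extracts it; the function name and the optional proc-root argument are hypothetical, not something the gist defines.

```shell
#!/bin/sh
# Print the PID of a process in each PID namespace it belongs to,
# outermost (host) namespace first, separated by spaces.
# $1 = PID as seen from the host, $2 = proc root (defaults to /proc).
pids_across_namespaces() {
    status_file="${2:-/proc}/$1/status"
    # The line looks like "NSpid:<TAB>15962<TAB>7"; drop the label, keep the numbers.
    awk '$1 == "NSpid:" { $1 = ""; sub(/^ +/, ""); print }' "$status_file"
}

# Example, using the host PID reported by `ps -C sleep` above:
#   pids_across_namespaces 15962
```

For the sleep process in this experiment, one would expect two numbers: the host PID followed by the PID that ps reported inside the container.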

OK, now let's see if we can find the file we created in the container from the host:

$ vagrant@node1:~$ sudo find / -name my_file_xpto
/var/lib/docker/overlay2/caeb1bdf150b03733d7920def5866ec1dcfc38fe61176a89eca218e3bb088f2e/merged/my_file_xpto
/var/lib/docker/overlay2/caeb1bdf150b03733d7920def5866ec1dcfc38fe61176a89eca218e3bb088f2e/diff/my_file_xpto

This folder structure can differ depending on your Docker version and storage driver. But what is really interesting to note is that, from the host, we can see processes running within the container and we can also see the container's file system.
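
Instead of searching the whole disk, you can ask Docker where a container's file system lives. This sketch assumes the overlay2 storage driver, where docker inspect exposes GraphDriver.Data.MergedDir (the combined view) and GraphDriver.Data.UpperDir (the container's writable layer, the diff directory). The function names are my own.

```shell
#!/bin/sh
# Print the host directory holding the container's merged root file system.
# Assumption: the overlay2 storage driver is in use.
container_rootfs() {
    docker inspect --format '{{.GraphDriver.Data.MergedDir}}' "$1"
}

# Print the host directory holding only the files the container has written
# (the writable layer, named "diff" in the find output above).
container_diff_dir() {
    docker inspect --format '{{.GraphDriver.Data.UpperDir}}' "$1"
}

# Example (listing the directories requires root):
#   sudo ls "$(container_rootfs my-container)"
#   sudo ls "$(container_diff_dir my-container)"
```

With other storage drivers these fields may be absent or named differently, so treat the output as specific to this setup.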

Now let's finally check that the container is nothing more than a process running on the host machine. To do that, we retrieve the container's PID on the host and use nsenter to start a shell in the same namespaces that Docker configured for that process.

$ vagrant@node1:~$ docker inspect my-container --format {{.State.Pid}}
15782
$ vagrant@node1:~$ sudo nsenter --target 15782 --mount --uts --ipc --net --pid /bin/sh
$

This gives us a shell within the container. We can now inspect and interact with its file system and network, start new processes, etc. Compare these outputs with the host machine and with the container we launched with docker run.

$ ls -l
drwxr-xr-x    2 root     root         12288 May 13 02:35 bin
drwxr-xr-x    5 root     root           360 May 17 02:33 dev
drwxr-xr-x    1 root     root          4096 May 17 02:33 etc
drwxr-xr-x    2 nobody   nogroup       4096 May 13 02:35 home
-rw-r--r--    1 root     root             1 May 17 03:34 my_file_xpto
dr-xr-xr-x  112 root     root             0 May 17 02:33 proc
drwx------    1 root     root          4096 May 17 03:34 root
dr-xr-xr-x   13 root     root             0 May 17 02:33 sys
drwxrwxrwt    2 root     root          4096 May 13 02:35 tmp
drwxr-xr-x    3 root     root          4096 May 13 02:35 usr
drwxr-xr-x    4 root     root          4096 May 13 02:35 var

$ ps
PID   USER     TIME  COMMAND
    1 root      0:00 /bin/sh
    7 root      0:00 sleep 1000
   13 root      0:00 /bin/sh
   15 root      0:00 ps

$ hostname
93245b7721ac

$ ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:AC:11:00:02
          inet addr:172.17.0.2  Bcast:172.17.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:28 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2096 (2.0 KiB)  TX bytes:0 (0.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
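
To confirm from the host that the nsenter shell really joined the container's namespaces, you can compare the /proc/<pid>/ns links of two processes. This is a rough sketch with hypothetical function names. It checks the five namespace types passed to nsenter above, and it has to run as root to read another process's links.

```shell
#!/bin/sh
# Succeed (exit 0) if two processes share the namespace of the given type.
# Exit 1 if they differ, 2 if a link cannot be read.
# $1 = type (mnt, uts, ipc, net, pid), $2 and $3 = PIDs, $4 = proc root.
same_namespace() {
    ns_type=$1 pid_a=$2 pid_b=$3 proc_root=${4:-/proc}
    a=$(readlink "$proc_root/$pid_a/ns/$ns_type") || return 2
    b=$(readlink "$proc_root/$pid_b/ns/$ns_type") || return 2
    [ "$a" = "$b" ]
}

# Report, for each namespace type used with nsenter, whether two PIDs share it.
# $1 and $2 = PIDs, $3 = proc root (defaults to /proc).
compare_namespaces() {
    for t in mnt uts ipc net pid; do
        if same_namespace "$t" "$1" "$2" "${3:-/proc}"; then
            echo "$t shared"
        else
            case $? in
                1) echo "$t different" ;;
                *) echo "$t unreadable" ;;
            esac
        fi
    done
}

# Example: compare the container's main process with the host PID of the
# nsenter shell (both PIDs as seen from the host):
#   compare_namespaces 15782 <host-pid-of-nsenter-shell>
```

Comparing the container's PID with a regular host shell instead should report the namespaces as different.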

Now let's take a quick look at cgroups. Start the container again, this time with a memory limit (you need to exit all the running container terminals first).

$ vagrant@node1:~$ docker run -it --rm --memory 500m --name my-container busybox /bin/sh

We can see the cgroup settings for this container from the host (the path below is the cgroup v1 layout):

$ vagrant@node1:~$ docker inspect my-container --format {{.Id}}
d0278d99b66ae7582d8975c8058ba7cc239ff422f392e0ec5df1c5de9ae370cc

$ vagrant@node1:~$ cat /sys/fs/cgroup/memory/docker/d0278d99b66ae7582d8975c8058ba7cc239ff422f392e0ec5df1c5de9ae370cc/memory.limit_in_bytes | numfmt --to=iec-i
500Mi
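
The path above only exists on hosts using cgroup v1. As a hedged sketch, the helper below takes a cgroup directory and reads whichever limit file it finds: memory.limit_in_bytes on v1 or memory.max on v2. The function names are hypothetical, and the directory for a given container on a v2 host depends on the cgroup driver, so it is passed in rather than guessed.

```shell
#!/bin/sh
# Print the memory limit configured for a cgroup directory.
# $1 = cgroup directory, e.g. /sys/fs/cgroup/memory/docker/<container-id>
memory_limit_bytes() {
    cgroup_dir=$1
    if [ -r "$cgroup_dir/memory.limit_in_bytes" ]; then
        cat "$cgroup_dir/memory.limit_in_bytes"   # cgroup v1
    elif [ -r "$cgroup_dir/memory.max" ]; then
        cat "$cgroup_dir/memory.max"              # cgroup v2 ("max" means no limit)
    else
        echo "no memory limit file found in $cgroup_dir" >&2
        return 1
    fi
}

# Convert a byte count to whole MiB using shell arithmetic.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Example (cgroup v1 host, container ID taken from docker inspect):
#   id=$(docker inspect my-container --format '{{.Id}}')
#   bytes_to_mib "$(memory_limit_bytes "/sys/fs/cgroup/memory/docker/$id")"
```

Docker interprets the m suffix as mebibytes, so a 500m limit is stored as 524288000 bytes.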

Now, from inside the container, let's check the memory:

$ free -m
              total        used        free      shared  buff/cache   available
Mem:            985         250         139           5         594         650
Swap:           979           0         979

Why doesn't it show the 500m value we defined when we started the container? What happens when your container reaches the limit? I won't answer those questions 😄 I'll leave them for you to check.

Vagrant.configure("2") do |config|
  config.vm.box = "bento/ubuntu-18.04"
  config.vm.hostname = "node1"
  config.vm.provision "shell", inline: <<-SHELL
    # https://docs.docker.com/engine/install/ubuntu/
    apt-get update
    apt-get install -y \
      apt-transport-https \
      ca-certificates \
      curl \
      gnupg-agent \
      software-properties-common
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    apt-key fingerprint 0EBFCD88
    add-apt-repository \
      "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
      $(lsb_release -cs) \
      stable"
    apt-get update
    apt-get install -y docker-ce docker-ce-cli containerd.io
    # https://docs.docker.com/engine/install/linux-postinstall/
    usermod -aG docker vagrant
  SHELL
end