Demystifying Docker Volumes for Mac and PC Users

Docker runs on a Linux kernel

Docker can be confusing to Mac and Windows users because many tutorials on the topic assume you're using a Linux machine.

As a Linux user, you learn that volumes are stored in a part of the host filesystem managed by Docker, namely /var/lib/docker/volumes. When you're running Docker on a Windows or macOS machine, you will read the same documentation and instructions but feel frustrated because that path doesn't exist on your system. This simple note is my answer to that.

When you use Docker on a Windows PC, you're typically doing one of these two things:

Run Linux containers in a full Linux VM (what Docker typically does today)
Run Linux containers with Hyper-V isolation

In the first option, when you bind mount volumes using docker run -v, the files are stored on the Windows NTFS filesystem, and there are known incompatibilities for many popular services. This is what the Microsoft documentation says:

These applications all require volume mapping and will not start or run correctly.

MySQL
PostgreSQL
WordPress
Jenkins
MariaDB
RabbitMQ

When using Docker for Mac, you're actually running an instance of Alpine Linux through a lightweight virtualization layer. The hypervisor provides filesystem and network sharing that is more "Mac-native".

Why does this matter? Because the majority of online tutorials do not take these differences into account when showing code snippets and explanations of Docker volumes.

Docker Volumes and Storage

Following the official documentation, you will learn that Docker stores data on the local filesystem by creating this directory structure under /var/lib/docker. This is where Docker stores all its data (files related to images and containers running on the host):

/var/lib/docker
  /aufs
  /containers
  /image
  /volumes

Any volumes created are stored under volumes:

# the following command:
docker volume create data_volume
# creates the following directory
/var/lib/docker
  /volumes
    /data_volume
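To check where Docker itself reports a volume's data to be, you can ask the CLI instead of guessing the path. The sketch below wraps docker volume inspect in a small helper; the function name volume_mountpoint is hypothetical, and it assumes a running Docker daemon. With the default local driver the reported path normally sits under /var/lib/docker/volumes. On Mac and Windows that path refers to the filesystem of the Docker VM, not to your own disk.

```shell
#!/bin/sh
# Print the path where Docker stores a named volume's data.
# "volume_mountpoint" is a hypothetical helper name, not a Docker command.
volume_mountpoint() {
  # --format takes a Go template; Mountpoint is a field in the
  # output of `docker volume inspect`.
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Usage (requires a running Docker daemon):
#   volume_mountpoint data_volume
```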

Now this is where the confusion begins. Non-Linux users try to cd to the path given in the documentation or tutorial and cannot find it, resulting in threads like Link 1 and Link 2.

As it turns out, you will need to get into the Docker VM on your machine. I'll provide the example for a Mac user.

Suppose I run the mysql service and inspect my volumes. I will find the following:

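# the official mysql image will not start without a root password setting;
# "example" is a placeholder value
docker run -e MYSQL_ROOT_PASSWORD=example -v data_volume:/var/lib/mysql mysql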
docker volume ls
DRIVER              VOLUME NAME
local               2dc8364
local               7909f81
...                 ...
local               data_volume

Now open a second terminal and connect to the Docker VM's tty with this command (Mac):

screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty

tty is short for teletype, known today as the terminal. The screen command lets a user run multiple terminal sessions from a single console. When a session is detached, its process keeps running and the user can reattach to it later. We use screen to connect to the Docker VM's terminal (tty) in the command above.

You will now be in the Docker VM:

cd /var/lib/docker/volumes
pwd
# returns: /var/lib/docker/volumes
uname -r
# returns: 4.9.184-linuxkit
ls
# returns:
# 2dc8364
# 7909f81
# ...
# data_volume

Pro-tip: Do not connect to the tty from another terminal tab while a screen session is already attached, as you will just see garbled text. Detach from the Linux screen session using Ctrl-a then d; this keeps the screen session active so you can reattach to it later using screen -r. Use screen -ls to list your screen sessions. To kill the session and exit, use Ctrl-a then k.
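If attaching to the VM's tty is awkward on your setup, another way to look inside a volume is to mount it into a short-lived container and list its files from there. This is only a sketch: the helper name list_volume_files and the /inspect mount path are hypothetical, and alpine is an assumed image choice (any image that ships ls would do). The volume is mounted read-only so that inspecting it cannot change the data.

```shell
#!/bin/sh
# List the files in a named volume by mounting it into a throwaway container.
# "list_volume_files" and the /inspect path are hypothetical names;
# alpine is an assumed image choice.
list_volume_files() {
  # --rm  removes the container as soon as ls exits
  # :ro   mounts the volume read-only
  docker run --rm -v "$1":/inspect:ro alpine ls -la /inspect
}

# Usage (requires a running Docker daemon):
#   list_volume_files data_volume
```

Because the container runs inside the same Linux VM as the Docker daemon, this works the same way on Mac, Windows and Linux.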

Default location varies by service

This is not specific to Mac or Windows users, but knowing where each of your services stores its data by default is important when configuring your volume mount.

For example, mysql stores its data in /var/lib/mysql by default. If we wish to back that directory with the data_volume volume we created earlier, we could do the following:

docker run -e MYSQL_ROOT_PASSWORD=example -v data_volume:/var/lib/mysql mysql  # "example" is a placeholder password

Now all data created by the mysql service will be stored in data_volume on the Docker host, so that the data persists in that volume even when the container is destroyed. To fully inspect data_volume, connect to the Docker VM as described earlier. Other services have different defaults, so read their documentation thoroughly. Postgres, for example, stores its database files in /var/lib/postgresql/data, so your docker-compose.yaml file will have the following configuration instead (note that the leading ./ makes this a bind mount to a directory next to the compose file, not a named volume):

volumes:
  - ./postgres-data:/var/lib/postgresql/data

Default to volume mounting, not bind mounts

You can create the volume explicitly using docker volume create db_vol or implicitly:

When you provide a docker-compose file with a volume mount configuration
When you docker run -v db_vol:/var/lib/mysql mysql and db_vol does not yet exist (not created using the explicit command docker volume create)

The above options (both explicit and implicit) create the directory under /var/lib/docker:

/var/lib/docker
  /volumes
    /db_vol

This is a different concept from bind mounting. Consider the case where your data is already persisted in some other storage location on the Docker host (in this example: /data/) that is not in the default /var/lib/docker directory. You can provide the full path when doing the mount. The code snippet shows the difference between the two types of mounting:

# volume mounts (stored under /var/lib/docker/volumes on the Docker host)
docker run -v db_vol:/var/lib/mysql mysql
# bind mounts
docker run -v /data/mysql:/var/lib/mysql mysql

Volume mount: Mounts the volume from the volumes directory. Best way to persist data.
Bind mount: Mounts a file or directory from any location on the Docker host, even important system files or directories.
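The two forms differ only in the source half of the -v argument, which is easy to misread. The sketch below is a simplified heuristic, not Docker's actual parsing logic: it assumes the source:target form, treats a source that looks like a path as a bind mount and a bare name as a named volume, and ignores Windows drive-letter paths. The function name mount_kind is hypothetical.

```shell
#!/bin/sh
# Classify the source half of a `-v source:target` argument.
# Simplified heuristic (an assumption, not Docker's own parser):
#   looks like a path -> bind mount
#   bare name         -> named volume
# "mount_kind" is a hypothetical helper name.
mount_kind() {
  src=${1%%:*}   # keep only the text before the first colon
  case "$src" in
    /*|./*|../*) echo "bind mount" ;;
    *)           echo "volume" ;;
  esac
}

# Usage:
#   mount_kind db_vol:/var/lib/mysql        # prints: volume
#   mount_kind /data/mysql:/var/lib/mysql   # prints: bind mount
```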

When you create a volume, it is stored within a directory on the Docker host. When you mount the volume into a container, this directory is what is mounted into the container. This is similar to the way that bind mounts work, except that volumes are managed by Docker and are isolated from the core functionality of the host machine (whereas a bind mount can mount a directory from any location on the host).

A volume can be mounted into multiple containers simultaneously. When no container is using it, it can be removed with docker volume rm, or in bulk along with other unused volumes with docker volume prune.
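Before removing a volume, it can help to check whether any container still references it. The following is a minimal sketch, assuming a running Docker daemon; the helper name remove_if_unused is hypothetical. It uses the volume filter of docker ps to find containers (running or stopped) that mount the volume, and calls docker volume rm only when there are none.

```shell
#!/bin/sh
# Remove a named volume only if no container references it.
# "remove_if_unused" is a hypothetical helper name.
remove_if_unused() {
  # -a includes stopped containers; -q prints only container IDs;
  # the volume filter keeps containers that mount the given volume.
  users=$(docker ps -a -q --filter "volume=$1")
  if [ -z "$users" ]; then
    docker volume rm "$1"
  else
    echo "volume $1 is still used by: $users" >&2
    return 1
  fi
}

# Usage (requires a running Docker daemon):
#   remove_if_unused db_vol
```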

When you use a bind mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its full path on the host machine. Bind mounts therefore rely on the host machine's filesystem having a specific directory structure available. They also allow access to sensitive files on the host filesystem, including creating, modifying or deleting important system files or directories, which can impact non-Docker processes on the host system.

WaYdotNET/docker-volumes.md