- Docker runs on a Linux kernel
Docker can be confusing to Mac and Windows users because many tutorials on the topic assume you're using a Linux machine.
As a Linux user, you learn that volumes are stored in a part of the host filesystem managed by Docker: /var/lib/docker/volumes. When you're running Docker on a Windows or macOS machine, you read the same documentation and instructions but get frustrated because that path doesn't exist on your system. This simple note is my answer to that.
When you use Docker on a Windows PC, you're typically doing one of these two things:
- Run Linux containers in a full Linux VM (what Docker typically does today)
- Run Linux containers with Hyper-V isolation
With the first option, when you bind mount volumes using docker run -v, the files are stored on the Windows NTFS filesystem, and there are known incompatibilities for many popular services. This is what the Microsoft documentation says:
These applications all require volume mapping and will not start or run correctly.
- MySQL
- PostgreSQL
- WordPress
- Jenkins
- MariaDB
- RabbitMQ
When using Docker for Mac, you're actually running an instance of Alpine Linux through a lightweight virtualization layer. The hypervisor provides filesystem and network sharing that is more "Mac-native".
Why does this matter? Because the majority of online tutorials do not take these differences into account when showing code snippets and explanations of Docker volumes.
- Docker Volumes and Storage
Following the official documentation, you will learn that Docker stores data on the local filesystem by creating this directory structure under /var/lib/docker. This is where Docker stores all its data (files related to images and containers running on the host). The aufs entry reflects the storage driver in use; on newer installations you will typically see overlay2 instead:
/var/lib/docker
  /aufs
  /containers
  /image
  /volumes
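Rather than assuming the path, you can ask the daemon where its data root is. The sketch below wraps docker info and its DockerRootDir template field in a small helper; the function name docker_root_dir is my own and not part of Docker. On Linux it typically prints /var/lib/docker, while on Docker Desktop the path it prints exists only inside the VM.

```shell
# Hypothetical helper (the name is not part of Docker): print the directory
# where the Docker daemon stores images, containers and volumes.
docker_root_dir() {
  docker info --format '{{ .DockerRootDir }}'
}
```

Call it as docker_root_dir once the Docker daemon is running.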
Any volumes you create are stored under volumes:
# the following command:
docker volume create data_volume
# creates the following directory:
/var/lib/docker
  /volumes
    /data_volume
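Docker can also report this location itself. As a minimal sketch, the helper below asks docker volume inspect for the Mountpoint field of a named volume; volume_mountpoint is a hypothetical name chosen for this note. With the default local driver, the reported path should sit under /var/lib/docker/volumes, and on Mac or Windows it refers to the filesystem inside the Docker VM rather than your own disk.

```shell
# Hypothetical helper: print the path Docker reports for a named volume.
# $1 is the volume name, for example data_volume.
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

For the volume created above, you would run volume_mountpoint data_volume.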
Now this is where the confusion begins. Non-Linux users try to cd to the path given by the documentation or tutorial and can't find it, resulting in threads like Link 1 and Link 2.
As it turns out, you will need to get into the Docker VM on your machine. I'll provide the example for a Mac user.
Suppose I run the mysql service and inspect my volumes. I will find the following:
docker run -v data_volume:/var/lib/mysql mysql
docker volume ls
DRIVER    VOLUME NAME
local     2dc8364
local     7909f81
...       ...
local     data_volume
Now open up a second terminal and connect to the tty of the Docker VM using this (Mac):
screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty
tty is short for teletype, known today as the terminal. The screen command lets you run multiple terminal sessions from a single console. When a session is detached, its process keeps running and you can reattach to the session later. In the command above, we use screen to connect to the Docker VM's terminal (tty).
You will now be in the Docker VM:
cd /var/lib/docker/volumes
pwd
# returns: /var/lib/docker/volumes
uname -r
# returns: 4.9.184-linuxkit
ls
# returns:
# 2dc8364
# 7909f81
# ...
# data_volume
Pro-tip: Do not open another terminal tab to connect to the same tty while a session is still attached, as you will just see garbled text. Detach from the screen session using Ctrl-a + d; this keeps the session active so you can reattach to it later using screen -r. Use screen -ls to list your screen sessions. To kill this session and exit, use Ctrl-a + k.
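If the tty path above does not exist on your Docker Desktop version, another way to look inside a volume is to mount it into a throwaway container and list it from there. This is a sketch under my own assumptions: list_volume_files and the /volume_data mount target are names I made up, and the alpine image is used only because it is small.

```shell
# Hypothetical helper: list the contents of a named volume without
# entering the Docker VM. The volume is mounted read-only (:ro) into a
# temporary alpine container that is removed afterwards (--rm).
list_volume_files() {
  docker run --rm -v "$1":/volume_data:ro alpine ls -la /volume_data
}
```

For example, list_volume_files data_volume should show the files that mysql wrote into the volume.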
- Default location varies by services
This is not specific to Mac or Windows users, but knowing where each service stores its data by default is important for configuring your volume mount.
For example, mysql stores its data in /var/lib/mysql by default. If we wish to back that directory with the data_volume volume we created earlier, we could do the following:
docker run -v data_volume:/var/lib/mysql mysql
Now all data created by the mysql service is written to data_volume on the Docker host, so that even when the container is destroyed the data persists in that volume. To inspect data_volume, follow the instructions in the Docker Volumes and Storage section above. Other services have different defaults, so read their documentation thoroughly. Postgres, for example, stores its database files in /var/lib/postgresql/data, so your docker-compose.yaml file will have the following configuration instead (note that ./postgres-data is a path relative to the compose file, which makes this particular example a bind mount):
volumes:
  - ./postgres-data:/var/lib/postgresql/data
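Besides reading the documentation, you can check which data directories an image declares as volumes. The sketch below reads the Config.Volumes field from docker image inspect; image_declared_volumes is a hypothetical helper name, and the image must already be pulled locally. Images that declare no volume may simply print null, so treat this as a hint, not a replacement for the service's documentation.

```shell
# Hypothetical helper: print the volume paths an image declares, as JSON.
# $1 is a locally available image name, for example mysql or postgres.
image_declared_volumes() {
  docker image inspect --format '{{ json .Config.Volumes }}' "$1"
}
```

You would call it as image_declared_volumes mysql after a docker pull mysql.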
- Default to volume mounting, not bind mounts
You can create the volume explicitly using docker volume create db_vol or implicitly:
- When you provide a docker-compose file with a volume mount configuration
- When you docker run -v db_vol:/var/lib/mysql mysql and db_vol doesn't yet exist (not created using the explicit command docker volume create)
The above options (both explicit and implicit) create the directory under /var/lib/docker:
/var/lib/docker
  /volumes
    /db_vol
This is a different concept from bind mounting. Consider the case where your data already lives in some other storage location on the Docker host (in this example: /data/) that is not under the default /var/lib/docker directory. You can provide the full path when doing the mount. The code snippet below shows the difference between the two types of mounting:
# volume mounts (default to /var/lib/docker on Docker host)
docker run -v db_vol:/var/lib/mysql mysql
# bind mounts
docker run -v /data/mysql:/var/lib/mysql mysql
- Volume mount: Mounts the volume from the volumes directory. Best way to persist data.
- Bind mount: Mounts a directory from any location on the Docker host, even important system files or directories
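With docker run -v, the only thing that tells the two apart is the part before the colon. As a rough sketch of that rule, assuming Linux-style paths, the hypothetical mount_kind function below treats an absolute path as a bind mount and anything else as a named volume. It does not call Docker at all; it only mirrors the distinction described above.

```shell
# Hypothetical helper: classify the SOURCE part of `docker run -v SOURCE:TARGET`.
# Assumption: Linux-style paths, where an absolute path means a bind mount
# and any other value is treated as the name of a volume.
mount_kind() {
  case "$1" in
    /*) echo "bind mount" ;;
    *)  echo "named volume" ;;
  esac
}

mount_kind db_vol        # prints: named volume
mount_kind /data/mysql   # prints: bind mount
```

If you prefer to spell the type out instead of relying on this rule, docker run also accepts the longer form, for example --mount type=volume,source=db_vol,target=/var/lib/mysql or --mount type=bind,source=/data/mysql,target=/var/lib/mysql.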
When you create a volume, it is stored within a directory on the Docker host. When you mount the volume into a container, this directory is what is mounted into the container. This is similar to the way that bind mounts work, except that volumes are managed by Docker and are isolated from the core functionality of the host machine (since bind mount can mount a directory from any location on the host).
A volume can be mounted into multiple containers simultaneously, and when it is no longer used by any container it can be removed using docker volume prune.
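Before pruning, it can be worth checking which volumes Docker considers unused, since removing a volume deletes its data. The helpers below are a small sketch with names of my own choosing (unused_volumes, remove_volume); they wrap docker volume ls with the dangling filter and docker volume rm. Depending on your Docker version, docker volume prune may skip named volumes by default, so listing first is a cheap sanity check.

```shell
# Hypothetical helper: print the names of volumes not referenced by any container.
unused_volumes() {
  docker volume ls --quiet --filter dangling=true
}

# Hypothetical helper: remove a single volume by name ($1).
# Docker refuses to remove a volume that a container still uses.
remove_volume() {
  docker volume rm "$1"
}
```

A cautious workflow would be to run unused_volumes, review the list, and then call remove_volume db_vol only for volumes you are sure you no longer need.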
When you use a bind mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its full path on the host machine. Bind mounts therefore rely on the host machine's filesystem having a specific directory structure available. They also allow access to sensitive files on the host filesystem, including creating, modifying or deleting important system files or directories, which can impact non-Docker processes on the host system.