Docker is a tool for packaging and shipping apps. Based on the idea of a shipping container, it provides a standardized way for developers or system administrators to create lightweight images, or collections of images, for each element of an application, and then easily and quickly deploy the image. Since the image is standardized, it can be uniformly deployed on development or production, leading to a much simpler workflow, faster development time for the dev team, and lower management overhead for the ops team.
First, a quick overview of a few things Docker is:
- An open source tool that places a layer on top of Linux containers (cgroups and namespaces) to make it simple to package and ship complex apps
- A tool for creating a layered filesystem; each layer is versioned and can be shared across running instances, making for much more lightweight deployments
- A company behind the project, as well as a site called the "Docker Hub" for sharing containers
And, a few things Docker isn't:
- A virtual machine. Unlike a true VM, a docker container does not also require a host OS, meaning it's much slimmer than a real VM
- An infrastructure automation tool, like Puppet or Chef. Like those other tools, Docker is a major player in the DevOps space, but its focus is around running apps in a container environment, as opposed to representing a machine state.
This guide introduces the key ideas you'll use again and again in Docker, such as images, layers, containers, commits, tags, and so forth. The main things to understand include:
- an image is a specific state of a filesystem
- an image is composed of layers representing changes in the filesystem at various points in time; layers are a bit like the commit history of a git repository
- a container is a running process that is started based on an image
- you can change the state of the filesystem on a container and commit it to create a new image
- changes in memory / state are not committed -- only changes on the filesystem
Docker uses the git-style command format:
$ docker [OPTIONS] COMMAND [arg...]
The following table, taken from the "docker help" command, provides a quick summary of the commands.
Command | Description |
---|---|
attach | Attach to a running container |
build | Build an image from a Dockerfile |
commit | Create a new image from a container's changes |
cp | Copy files/folders from a container's filesystem to the host path |
diff | Inspect changes on a container's filesystem |
events | Get real time events from the server |
export | Stream the contents of a container as a tar archive |
history | Show the history of an image |
images | List images |
import | Create a new filesystem image from the contents of a tarball |
info | Display system-wide information |
inspect | Return low-level information on a container |
kill | Kill a running container |
load | Load an image from a tar archive |
login | Register or log in to the Docker registry server |
logs | Fetch the logs of a container |
port | Lookup the public-facing port that is NAT-ed to PRIVATE_PORT |
pause | Pause all processes within a container |
ps | List containers |
pull | Pull an image or a repository from a Docker registry server |
push | Push an image or a repository to a Docker registry server |
restart | Restart a running container |
rm | Remove one or more containers |
rmi | Remove one or more images |
run | Run a command in a new container |
save | Save an image to a tar archive |
search | Search for an image on the Docker Hub |
start | Start a stopped container |
stop | Stop a running container |
tag | Tag an image into a repository |
top | Lookup the running processes of a container |
unpause | Unpause a paused container |
version | Show the Docker version information |
wait | Block until a container stops, then print its exit code |
As you scan the list (which I've left in alphabetical order), you'll notice 3 key groups of commands:
- Commands related to managing images, such as images, build, save, rmi, and tag.
- Commands related to containers, such as run, ps, kill, restart, top, and pause.
- Commands related to the Docker hub, such as login, search, pull, and push. (More on this in the next chapter.)
This guide is organized along these same basic lines. First, we'll cover how to get Docker up and running on a Mac or Windows machine using a tool called boot2docker. (If you're using a Linux-based machine, you can skip this chapter.) Next, we'll talk about images and layers, and then talk about containers, which are the "running" instances of an image. Next, we'll talk about the Docker Hub, which is a place where you can post your images, as well as find images maintained by other Docker users.
Although the examples might be a bit contrived, the goal of the guide is to give you a quick introduction to the ideas you'll use again and again as you explore Docker in depth.
Docker's underlying containerization technology only works on Linux. To use it on a Mac or Windows, you must use a command-line tool called boot2docker that installs a Linux virtual machine (or "VM") on your system. If you're unfamiliar with VMs, they let you run a "guest" operating system (like Linux) on your "host" operating system (like a Mac or Windows). For all intents and purposes, the VM is another computer running "inside" your computer -- it has its own operating system, filesystem, and network that can piggyback on the resources of the host OS in an isolated environment.
So, the idea behind boot2docker is really quite clever: since we can't run Docker natively on Windows or a Mac, we install a bare-bones Linux VM that can run Docker, and then we communicate with it using a Docker client running on our host (i.e., the terminal on our main OS). If you've ever used a package like Postgres or MySql, it's exactly the same idea: you have a client tool that you use to issue commands to another server. The main difference is that our "docker" process is running inside a VM, rather than as a native service.
This chapter will walk you through the installation process. Since both Docker and boot2docker are changing so rapidly, rather than describe the detailed steps here, I've provided an overview of the process so that you'll understand what's happening.
Before you start, it's a good idea to first set up SSH on your host machine so that boot2docker can install your login credentials on the VM. Otherwise, you'll be prompted for a password whenever you log in. If you're unfamiliar with SSH or SSH keys, Appendix ??? will give you details of how to get set up.
First, download the installation package for your system. You can find Mac and Windows installation instructions on the Docker's documentation site. Next, you run the install procedure, which will install the boot2docker command line tool as well VirtualBox, a free tool from Oracle that allows your computer to run virtual machines.
Once the basic tools are installed, you're ready to create the box by running boot2docker init. (Think of this as purchasing a new computer.) This will create a new Linux "box" called "boot2docker-vm" on your machine that has Docker installed, as well as configure some of the basic things you'll need to communicate with it from your host. The command will produce something like this:
$ boot2docker init
2014/08/11 13:30:58 Creating VM boot2docker-vm...
2014/08/11 13:30:58 Apply interim patch to VM boot2docker-vm (https://www.virtualbox.org/ticket/12748)
2014/08/11 13:30:58 Setting NIC #1 to use NAT network...
2014/08/11 13:30:58 Port forwarding [ssh] tcp://127.0.0.1:2022 --> :22
2014/08/11 13:30:58 Port forwarding [docker] tcp://127.0.0.1:2375 --> :2375
2014/08/11 13:30:58 Setting NIC #2 to use host-only network "vboxnet4"...
2014/08/11 13:30:59 Setting VM storage...
2014/08/11 13:31:07 Done. Type `boot2docker up` to start the VM.
Once the box is created (it can take a few minutes because it has to download the image), you run boot2docker up to boot up the box. Think of this as turning your the new computer on. The box will start the Docker daemon in the VM that listens for requests from the our client on our host. Here's an example:
$ boot2docker up
2014/08/11 13:38:00 Waiting for VM to be started...
............
2014/08/11 13:38:36 Started.
2014/08/11 13:38:36 To connect the Docker client to the Docker daemon, please set:
2014/08/11 13:38:36 export DOCKER_HOST=tcp://192.168.59.104:2375
Finally, you'll need to set an environment variable called DOCKER_HOST that will tell your Docker client on your host machine the URI for the Docker daemon running on the VM. This address is the last line returned from the "boot2docker up" command. For example, here's what we'd run based on the output we got in the previous step:
$ export DOCKER_HOST=tcp://192.168.59.104:2375
Since running the export command in the shell will only set the environment variable temporarily, you should follow the process on your OS to make it permanent. On a Mac, add the following line to the ~/.bash_profile:
export DOCKER_HOST=tcp://$(boot2docker ip 2>/dev/null):2375
On windows, use the "environment variables" setting of the "Advanced system settings" tab
Once you've completed all these steps, you should have the Docker client installed and a VM capable of starting Docker containers. To test it out, try the command "docker run hello-world" from your host:
$ docker run hello-world
Unable to find image 'hello-world' locally
Pulling repository hello-world
565a9d68a73f: Pulling image (latest) from hello-world, endpoint: https://cdn-reg565a9d68a73f: Download complete
511136ea3c5a: Download complete
2505d942a91d: Download complete
Hello from Docker.
...
As you can see from the output, this command has pulled down the hello-world repository from (more on this in a later chapter) and printed the text "Hello from Docker." If you're seeing this message, then you've successfully installed boot2docker. Alternatively, if you get a message like the following, you should doublecheck that you've set the DOCKER_HOST environment variable correctly:
$ docker run hello-world
2014/08/11 15:05:55 Post http:///var/run/docker.sock/v1.13/containers/create: dial unix /var/run/docker.sock: no such file or directory
If you've installed an older version of boot2docker, you can (and should!) use boot2docker itself to update the package. The commands are:
$ boot2docker stop
$ boot2docker download
$ boot2docker start
boot2docker uses the increasingly common "git"-style format:
$ boot2docker [<options>] <command> [<args>]
For example, here's how to get the VM's current status:
$ boot2docker status
running
The following table, taken from "boot2docker help", summarizes the various commands.
Command | Description |
---|---|
init | Create a new boot2docker VM. You should only do this once to create a box. |
up | Starts the VM and the Docker daemon |
status | Get the current state of VM, such as "running," "saved," or "poweroff". |
ssh [ssh-command] | Start a shell on the VM using SSH. If you have not set up an SSH key, you'll be prompted for a password. (It's "tcuser") |
save / suspend | Suspend the VM and save state to disk. |
down / stop / halt | Gracefully shutdown the VM. |
restart | Gracefully reboot the VM. |
delete / destroy | Delete the boot2docker VM and its disk image. |
config / cfg | Displays selected settings for the VM, such as its memory size or IP address |
help | Displays the commands in this table |
info | A json dump of profile settings |
ip | The IP address of the VM's Host-only network; this can be used to set the DOCKER_HOST environment variable |
download | Download's the current boot2docker ISO image. (An ISO is an archive format based on old CDROM standard.) |
upgrade | Upgrade the boot2docker ISO image (if vm is running it will be stopped and started). |
version | Display the current version of boot2docker. It changes often, so be sure to update frequently! |
You can also provide a variety of options to further customize the VM, such as setting the amount of memory or the name of the box itself. Most of these options appear to only work when the image is created (i.e., they only apply then you're doing "boot2docker init"). The following table summarizes the options from "boot2docker help". (The actual output from the command shows the default settings for each option.)
Option | Description |
---|---|
--basevmdk | Path to VMDK to use as base for persistent partition |
--dhcp | enable VirtualBox host-only network DHCP. |
--dhcpip | VirtualBox host-only network DHCP server address. |
--disksize | boot2docker disk image size (in MB). |
--dockerport | host Docker port (forward to port 2375 in VM). |
--hostip | VirtualBox host-only network IP address. |
--iso | path to boot2docker ISO image. |
--lowerip | VirtualBox host-only network DHCP lower bound. |
--memory | virtual machine memory size (in MB). |
--netmask | VirtualBox host-only network mask. |
--serial | try serial console to get IP address (experimental) |
--serialfile | path to the serial socket/pipe. |
--ssh | path to SSH client utility. |
--ssh-keygen | path to ssh-keygen utility. |
--sshkey | path to SSH key to use. |
--sshport | host SSH port (forward to port 22 in VM). |
--upperip | VirtualBox host-only network DHCP upper bound. |
--vbm | path to VirtualBox management utility. |
--verbose | display verbose command invocations. |
--vm | virtual machine name. |
Here's an example of how you'd created an image that used 4GB, rather than the default 2GB.
$ boot2docker --memory=4096 init
2014/08/11 17:15:50 Creating VM boot2docker-vm...
2014/08/11 17:15:50 Apply interim patch to VM boot2docker-vm (https://www.virtualbox.org/ticket/12748)
2014/08/11 17:15:50 Setting NIC #1 to use NAT network...
2014/08/11 17:15:51 Port forwarding [ssh] tcp://127.0.0.1:2022 --> :22
2014/08/11 17:15:51 Port forwarding [docker] tcp://127.0.0.1:2375 --> :2375
2014/08/11 17:15:51 Setting NIC #2 to use host-only network "vboxnet4"...
2014/08/11 17:15:51 Setting VM storage...
2014/08/11 17:15:59 Done. Type `boot2docker up` to start the VM.
$ boot2docker info | python -m json.tool
{
"BaseFolder": "/Users/odewahn/VirtualBox/VMs/boot2docker-vm",
"BootOrder": null,
"CPUs": 4,
"CfgFile": "/Users/odewahn/VirtualBox/VMs/boot2docker-vm/boot2docker-vm.vbox",
"DockerPort": 2375,
"Flag": 0,
"Memory": 4096,
"Name": "boot2docker-vm",
"OSType": "",
"SSHPort": 2022,
"SerialFile": "/Users/odewahn/.boot2docker/boot2docker-vm.sock",
"State": "poweroff",
"UUID": "db24beb1-5b83-4744-9f5f-ce05b41eda91",
"VRAM": 8
}
By and large, boot2docker hides the nitty gritty details of working directly with VirtualBox, the underlying software that allows us to run the VMs. However, there's one particular situation -- opening a port from your VM to your host machine -- where you'll need to use the VirtualBox itself. The most common use case for doing this is when you are developing some custom app that is running within Docker, but you want to connect with it in your browser.
For example, say you're writing a Rails application. You might pull down the a Docker image that has a Ruby and Rails environment installed, start the local development server, and then try to connect to localhost:4567 to see the output, like you would in a "normal" environment. By default, however, boot2docker only exposes a few ports on the VM (port 2022 for SSH and port 2375 for the boot2docker API), so instead of seeing your app, you'll only see a "Cannot connect" page.
Fortunately, the fix for this is quite simple. VirtualBox has a command line utility called "VBoxManage" that you can use to customize the box as it runs. To expose a port, type something like this into your terminal:
$ VBoxManage controlvm boot2docker-vm natpf1 "rails-server,tcp,127.0.0.1,4567,,4567"
This will create a port forwarding rule that in VirtualBox that will last until you destroy the box. You can also create this rule in the VirtualBox GUI tool in the "Settings -> Network -> Port Forwarding" screen, as shows in the next figure.
You can find out more about this tool in the extensive VirtualBox documentation site.
A Docker image represents a snapshot of a filesystem at a certain point in time. As mentioned in the introduction, the image is composed of layers that progressively stack on top of each other; containers (running instances of an image) can share these layers among them, which is one reason Docker is so much lighter weight than a full VM, where nothing is generally shared.
Perhaps the best way to start (after you get Docker installed, of course!) is to use "docker pull" to grab the latest release of Ubuntu (we'll talk more about where you're actually pulling from in the chapter on the Docker Hub):
$ docker pull ubuntu
Pulling repository ubuntu
c4ff7513909d: Download complete
511136ea3c5a: Download complete
1c9383292a8f: Download complete
9942dd43ff21: Download complete
d92c3c92fa73: Download complete
0ea0d582fd90: Download complete
cc58e55aa5a5: Download complete
...
As you pull the image, you'll see the progress of each dependent layer being downloaded. Once all the layers are finished downloading (and there are a LOT of layers for Ubuntu and all the versions), you can run docker images
to get information about the images on your system. Here's an example:
$ docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
ubuntu 14.04 c4ff7513909d 2 days ago 225.4 MB
ubuntu latest c4ff7513909d 2 days ago 225.4 MB
ubuntu 14.04.1 c4ff7513909d 2 days ago 225.4 MB
ubuntu trusty c4ff7513909d 2 days ago 225.4 MB
ubuntu 14.10 75204fdb260b 2 days ago 230.1 MB
ubuntu utopic 75204fdb260b 2 days ago 230.1 MB
ubuntu 12.04.5 822a01ae9a15 2 days ago 108.1 MB
ubuntu precise 822a01ae9a15 2 days ago 108.1 MB
ubuntu 12.04 822a01ae9a15 2 days ago 108.1 MB
As you can see, the command returns the following columns:
- REPOSITORY. The name of the repository, which in this case is "ubuntu".
- TAG. We'll talk more about tags in a bit, but tags are similar to those found in git or other version control systems, and represent a specific set point in the repositories' commit history. As you can see from the list, we've pulled down a bunch of different versions of ubuntu: 14.04, 14.10, 12.04, etc. Each of these versions is tagged with a version number, a name, and there's even a special tag called "latest" which represents the latest version.
- IMAGE ID. This is like the primary key for the image. Sometimes, such as when you commit a container without specifying a name or tag, the repository or the tag is <NONE>, but you can always refer to a specific image or container using its ID.
- CREATED. The date the repository was created, as opposed to when it was pulled. This can help you assess how "fresh" a particular build is. Docker appears to update their master images on a fairly frequent basis.
- VIRTUAL SIZE. The size of the image.
If you want a granular view of the layers in in an image, you can use docker history
:
$ docker history ubuntu:latest
IMAGE CREATED CREATED BY SIZE
c4ff7513909d 7 days ago /bin/sh -c #(nop) CMD [/bin/bash] 0 B
cc58e55aa5a5 7 days ago /bin/sh -c apt-get update && apt-get dist-upg 32.67 MB
0ea0d582fd90 7 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
d92c3c92fa73 7 days ago /bin/sh -c rm -rf /var/lib/apt/lists/* 0 B
9942dd43ff21 7 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
1c9383292a8f 7 days ago /bin/sh -c #(nop) ADD file:c1472c26527df28498 192.5 MB
511136ea3c5a 14 months ago 0 B
Each line in the history corresponds to a commit of the image's filesystem. The values in the SIZE column add up to the corresponding VIRTUAL SIZE column for the image in docker image
. (If you decide to double check this, remember to that the column has units, so be sure to convert all values to MB.)
There are a couple of key things to understand about the layers in a docker images:
- They can be reused. Docker keeps track of all the layers you've pulled. So, if two images happen to have a layer in common (for example, if two images are built form the same base box), Docker will reuse the common parts, and only pull the diffs.
- The layers are always additive, which can lead to really big sizes if you're not careful. For example, if you download a large file, make a commit, delete the file, and then make another commit, that large file will still be present in the layer history. We'll come back to this idea again, so don't worry if it doesn't make too much sense right now. Just remember that layers are always additive.
A container is a running image that you start with the docker run
command, like this:
$ docker run [OPTIONS] IMAGE [COMMAND] [ARG...]
The command consists of:
- OPTIONS. There are a LOT of options for the run command. We'll cover this in a bit more depth in the next shortly.
- IMAGE. The name of the image that the container should start from.
- COMMAND. The command that should run on the container when it starts.
Let's try a simple command to to start a new container using the latest version of Ubuntu. Once this container starts, you'll be at a bash shell where you can do cat /etc/os-release
to see the release information about the OS:
$ docker run -it ubuntu:latest /bin/bash
root@4afa46473802:/# cat /etc/os-release
NAME="Ubuntu"
VERSION="14.04.1 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.1 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
Next, type this command in your running container to make a simple change on the filesystem:
root@4afa46473802:/# echo "hello world" > /tmp/hello.txt
Now that you've verified the OS release and created a test file, here's a bit more detail about the options we've used in the RUN command:
- The OPTIONS "-it" tell Docker to make the container interactive and provide a tty (i.e., attach a terminal)
- The IMAGE is the "ubuntu:latest" image we pulled down. Note the repository:tag format, which is used throughout docker. By default, the "latest" tag is used, but if we were to use "ubuntu:precise" we'd be using Ubuntu 12.04.
- The COMMAND is "/bin/bash", which starts a shell where you can log in
The cool thing is that once you're in an interactive docker shell, you can do anything you want, such as installing packages (via apt-get), code bases (using git), or dependencies (using pip, gem, npm, or any other package manager du jour.) You can then commit the container at any point to create a new image that has your changes.
To see this in action, open a new terminal on your host machine and use "docker ps" command to see some information about the running container:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4afa46473802 ubuntu:14.04 /bin/bash 17 hours ago Up 39 seconds jolly_perlman
This command will tell you:
- The CONTAINER_ID, a unique identifier you can use to refer to the container in other commands (this is kind of like a process id in Linux)
- The IMAGE that was used when the container started
- The COMMAND used to start the container
- The time the underlying image was CREATED
- The uptime STATUS
- The exposed ports (lot's more on this!)
- A human readable NAME that you can use in place of the ID (this is something you can also assign yourself, which we'll get into shortly.)
Now try the docker diff
command:
$ docker diff jolly_perlman
C /tmp
A /tmp/hello.txt
As you can see, this command returns the changes that have happened on the filesystem, which in our case is that we've changed the "/tmp" directory by adding a "hello.txt" file. Now let's commit the changes we've made into a new container:
$ docker commit -m "Adding hello world" jolly\_perlman hello\_docker
5af244124dd8656...
Taking another look at docker images
shows the effect; we now have a new image called "hello_docker" that includes our "/tmp/hello.txt" file. Taking a quick look at the container's history shows how our change added a whopping new 12 byte layer on top of the original layers of ubuntu:latest.
$ docker history hello_docker
IMAGE CREATED CREATED BY SIZE
5af244124dd8 18 hours ago /bin/bash 12 B
c4ff7513909d 2 days ago /bin/sh -c #(nop) CMD [/bin/bash] 0 B
cc58e55aa5a5 2 days ago /bin/sh -c apt-get update && apt-get dist-upg 32.67 MB
0ea0d582fd90 2 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
d92c3c92fa73 2 days ago /bin/sh -c rm -rf /var/lib/apt/lists/* 0 B
9942dd43ff21 2 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
1c9383292a8f 2 days ago /bin/sh -c #(nop) ADD file:c1472c26527df28498 192.5 MB
511136ea3c5a 14 months ago 0 B
Finally, we can kill our docker process like this:
$ docker kill jolly_perlman
jolly_perlman
If you do docker ps again, you'll see that the process has stopped.
The next chapter will provide much more detail on how to create your own containers and images using docker run, one of the most feature-rich commands in Docker. Taken directly from docker help, this table is a useful quick reference you can refer to as you go further with Docker.
Option | Description |
---|---|
-a, --attach=[] | Attach to stdin, stdout or stderr. |
-c, --cpu-shares=0 | CPU shares (relative weight) |
--cidfile="" | Write the container ID to the file |
--cpuset="" | CPUs in which to allow execution (0-3, 0,1) |
-d, --detach=false | Detached mode: Run container in the background, print new container id |
--dns=[] | Set custom dns servers |
--dns-search=[] | Set custom dns search domains |
-e, --env=[] | Set environment variables |
--entrypoint="" | Overwrite the default entrypoint of the image |
--env-file=[] | Read in a line delimited file of ENV variables |
--expose=[] | Expose a port from the container without publishing it to your host |
-h, --hostname="" | Container host name |
-i, --interactive=false | Keep stdin open even if not attached |
--link=[] | Add link to another container (name:alias) |
--lxc-conf=[] | (lxc exec-driver only) Add custom lxc options --lxc-conf="lxc.cgroup.cpuset.cpus = 0,1" |
-m, --memory="" | Memory limit (format: , where unit = b, k, m or g) |
--name="" | Assign a name to the container |
--net="bridge" | Set the Network mode for the container |
'bridge': creates a new network stack for the container on the docker bridge | |
'none': no networking for this container | |
'container:': reuses another container network stack | |
'host': use the host network stack inside the container. Note: the host mode gives the container full access to local system services such as D-bus and is therefore considered insecure. | |
-P, --publish-all=false | Publish all exposed ports to the host interfaces |
-p, --publish=[] | Publish a container's port to the host |
format: ip:hostPort:containerPort or ip::containerPort or hostPort:containerPort | |
(use 'docker port' to see the actual mapping) | |
--privileged=false | Give extended privileges to this container |
--rm=false | Automatically remove the container when it exits (incompatible with -d) |
--sig-proxy=true | Proxify received signals to the process (even in non-tty mode). SIGCHLD is not proxied. |
-t, --tty=false | Allocate a pseudo-tty |
-u, --user="" | Username or UID |
-v, --volume=[] | Bind mount a volume (e.g., from the host: -v /host:/container, from docker: -v /container) |
--volumes-from=[] | Mount volumes from the specified container(s) |
-w, --workdir="" | Working directory inside the container |
As we saw in the Docker Walkthrough chapter, the general Docker workflow is:
- start a container based on an image in a known state
- add things to the filesystem, such as packages, codebases, libraries, files, or anything else
- commit the changes as layers to make a new image
In the walkthrough, we took a very simple approach of just starting a container interactively, running the commands we wanted (like "apt-get install" and "pip install"), and then committing the container into a new image.
In this chapter, we'll look at a more robust way to build an image. Rather than just running commands and adding files with tools like wget, we'll put our instructions in a special file called the Dockerfile. A Dockerfile is similar in concept to the recipes and manifests found in infrastructure automation (IA) tools like Chef or Puppet.
Overall, a Dockerfile is much more stripped down than the IA tools, consisting of a single file with a DSL that has a handful of instructions. The format looks like this:
# Comment
INSTRUCTION arguments
The following table summarizes the instructions; many of these options map directly to option in the "docker run" command:
Command | Description |
---|---|
ADD | Copies a file from the host system onto the container |
CMD | The command that runs when the container starts |
ENTRYPOINT | |
ENV | Sets an environment variable in the new container |
EXPOSE | Opens a port for linked containers |
FROM | The base image to use in the build. This is mandatory and must be the first command in the file. |
MAINTAINER | An optional value for the maintainer of the script |
ONBUILD | A command that is triggered when the image in the Dockerfile is used as a base for another image |
RUN | Executes a command and save the result as a new layer |
USER | Sets the default user within the container |
VOLUME | Creates a shared volume that can be shared among containers or by the host machine |
WORKDIR | Set the default working directory for the container |
Once you've created a Dockerfile and added all your instructions, you can use it to build an image using the docker build
command. The format for this command is:
docker build [OPTIONS] PATH | URL | -
The build command results in a new image that you can start using docker run
, just like any other image. Each line in the Dockerfile will correspond to a layer in the images' commit history.
Perhaps the best way to understand a Dockerfile is to dive into an example. Let's take a look at the example we went through in our overview chapter and condense it into a Dockerfile:
#
# Super simple example of a Dockerfile
#
FROM ubuntu:latest
MAINTAINER Andrew Odewahn "[email protected]"
RUN apt-get update
RUN apt-get install -y python python-pip wget
RUN pip install Flask
ADD hello.py /home/hello.py
WORKDIR /home
As you can see, it's pretty straightforward: we start from "ubuntu:latest," install dependencies with the RUN
command, add our code file with the ADD
command, and then set the default directory for when the container starts. Once we have a Dockerfile itself, we can build an image using docker build
, like this:
$ docker build -t "simple_flask:dockerfile" .
The "-t" flag adds a tag to the image so that it gets a nice repository name and tag. Also note the final ".", which tells Docker to use the Dockerfile in the current directory. Once you start the build, you'll see it churn away for a while installing things, and when it completes, you'll have a brand new image. Running docker history
will show you the effect of each command has on the overall size of the file:
$ docker history simple_flask:dockerfile
IMAGE CREATED CREATED BY SIZE
9ada423c0a60 3 days ago /bin/sh -c #(nop) WORKDIR /home 0 B
5c3625267cd9 3 days ago /bin/sh -c #(nop) ADD file:96e699cd177f1a3f3c 163 B
9c20a6548fbe 3 days ago /bin/sh -c pip install Flask 4.959 MB
7195370ae6e1 3 days ago /bin/sh -c apt-get install -y python python-p 136.1 MB
761bf82875cc 3 days ago /bin/sh -c apt-get update 19.94 MB
40b29df1d2c2 3 days ago /bin/sh -c #(nop) MAINTAINER Andrew Odewahn " 0 B
c4ff7513909d 9 days ago /bin/sh -c #(nop) CMD [/bin/bash] 0 B
cc58e55aa5a5 9 days ago /bin/sh -c apt-get update && apt-get dist-upg 32.67 MB
0ea0d582fd90 9 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
d92c3c92fa73 9 days ago /bin/sh -c rm -rf /var/lib/apt/lists/* 0 B
9942dd43ff21 9 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
1c9383292a8f 9 days ago /bin/sh -c #(nop) ADD file:c1472c26527df28498 192.5 MB
511136ea3c5a 14 months ago 0 B
Finally, you can start the container itself with the following command:
$ docker run -p 5000:5000 simple_flask:dockerfile python hello.py
Notice that in this example we're running the Flask app directly when we start the container, rather than just running the bash shell and starting it as we've done in other examples.
Dockerfiles provide a relatively simple way to create a base image. And, because you can use the FROM command to chain Dockerfiles together into increasingly complex images, you can do quite a lot, even with Docker's (refreshingly!) minimal command set. But, if you already have an existing IA tool (and you should!), such as Chef, Puppet, Ansible, Salt, it's very unlikely you could or even should rewrite everything. So, if you're in this situation what can you do?
I HAVE NO IDEA! I NEED TO RESEARCH THIS MORE, BUT I THINK YOU CAN USE A TOOL LIKE Packer. MAYBE I CAN CONVINCE Jeroen Janssens to do something with his Ansible stuff
Now that we've got the basics, let's make something a tiny bit more realistic: a Flask app. In this section we'll:
- Start a basic container named "simple_flask" based on Ubuntu
- Install the dependencies
- Install the app itself (which is just the "Hello, World!" example, so it's nothing exciting)
- Commit the container to a new image for our app
- Start a new container based on our image
- Access our app using a browser
Again, the point here is not to show the best way to set up an environment, but instead to illustrate the Docker commands and what they do. I'll code development environments in more detail later.
Our first step is to start a new container based on "ubuntu:latest." As we did in the previous example, we'll make it interactive using the "-it" options, but this time we'll give it our own name:
$ docker run -it --name="simple_flask" ubuntu:latest /bin/bash
The nice thing about using a name is that you can use it in other docker commands in place of the container id. For example, you can use this command to see the top process running on the container. (Also, it's worth noting that you need to run this on a terminal on your host, not in the container itself.)
$ docker top simple_flask
PID USER COMMAND
969 root /bin/bash
Now that we've got the container running, we install the dependencies we need. First, since Flask is a Python micro framework, let's check out what version of Python we have:
root@2f5ada6523c4:/home# python --version
bash: python: command not found
"Wait a minute!" you might be asking yourself, "What gives? Shouldn't Python be installed on Ubuntu?" And, the answer is "Yes," on a "real" Ubuntu distribution it is. But, to save on size, the Docker base images are stripped down versions of the official images. This is important to keep this in mind as you build new containers: don't take it for granted that everything on the official distribution will be present on the Docker base image.
So, let's get a minimal python environment going. First, we'll update our apt repository:
root@2f5ada6523c4:/home# apt-get update
This will churn away for a while updating various packages. Once it's complete, install python and pip:
root@2f5ada6523c4:/home# apt-get install -y python python-pip wget
This churns even longer, but once it's done, you can install Flask:
root@2f5ada6523c4:/home# pip install Flask
Whew. That's a lot of stuff to install, so let's commit the image with a simple commit message:
$ docker commit -m "installed python and flask" simple_flask
c21c062b...
So, now if you check your "docker images" you'll see the new repository:
admins-air-5:~ odewahn$ docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
<none> <none> c21c062b086a 2 days ago 385.7 MB
ubuntu latest c4ff7513909d 7 days ago 225.4 MB
But, ugh, it doesn't have a name -- it just shows up as "<none>". We can use docker tag to give our image a nice name, like this:
$ docker tag c21c062b086a simple_flask
$ docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
simple_flask latest c21c062b086a 2 days ago 385.7 MB
ubuntu latest c4ff7513909d 7 days ago 225.4 MB
Finally, taking a quick look at the history shows that we've loaded about 160MB worth of new stuff onto our image:
$ docker history simple_flask
IMAGE CREATED CREATED BY SIZE
c21c062b086a 2 days ago /bin/bash 160.3 MB
c4ff7513909d 7 days ago /bin/sh -c #(nop) CMD [/bin/bash] 0 B
cc58e55aa5a5 7 days ago /bin/sh -c apt-get update && apt-get dist-upg 32.67 MB
0ea0d582fd90 7 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
d92c3c92fa73 7 days ago /bin/sh -c rm -rf /var/lib/apt/lists/* 0 B
9942dd43ff21 7 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
1c9383292a8f 7 days ago /bin/sh -c #(nop) ADD file:c1472c26527df28498 192.5 MB
511136ea3c5a 14 months ago 0 B
Let's get our code installed. First, fire up a new container:
$ docker run -it -p 5000:5000 simple_flask /bin/bash
As before, we'll use the "-it" options to make it interactive, but we're now adding a couple of new settings:
- The "-p 5000:5000" option exposes port 5000 on the container and routes it to port 5000 on the host. If you're running Docker on a Mac, the confusing thing to remember is that from the Docker container's perspective, the virtual machine is the host, so you'll also need to expose port 5000 from the VM in order to see it on your "real" host OS. We'll cover this Inception-like step in the next section.
- The "-w /home" option sets the working directory used when the container starts. Note that this must be an absolute path.
Once we're in, we'll grab our simple hello world file from this projects github repo:
# wget https://raw.githubusercontent.com/odewahn/docker-jumpstart/master/examples/hello.py
--2014-08-17 05:38:02-- https://raw.githubusercontent.com/odewahn/docker-jumpstart/master/examples/hello.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 199.27.78.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|199.27.78.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 149 [text/plain]
Saving to: 'hello.py'
100%[===========================================================================>] 149 --.-K/s in 0s
2014-08-17 05:38:02 (1.34 MB/s) - 'hello.py' saved [149/149]
root@2f5ada6523c4:/home# cat hello.py
from flask import Flask
app = Flask(__name__)
@app.route('/')
def hello_world():
return 'Hello World!'
if __name__ == '__main__':
app.run(host='0.0.0.0')
Finally, once you've got the file, start the app. You should see something like this:
root@2f5ada6523c4:/home# python hello.py
* Running on http://127.0.0.1:5000/
We're ready for the big reveal. On a terminal on your host, try this command:
$ curl localhost:5000
If you're running this on Linux as your host OS, you should see "Hello World!" printed. If you're running it on boot2docker on a Mac or Windows, you most likely got this error:
curl: (7) Failed connect to localhost:5000; Connection refused
As I mentioned earlier, when you started the container and told it to expose port 5000, it exposed it on the virtual machine, not on your host. As described in the chapter on boot2docker, you need to use VBoxManage to open the port from the VM onto your host, like this:
$ VBoxManage controlvm boot2docker-vm natpf1 "flask-server,tcp,127.0.0.1,5000,,5000"
You should now be able to connect to the simple app, like this:
$ curl localhost:5000
Hello World!
If you're really excited about your "Hello World!" Flask app, you should feel free to commit it. Otherwise, let's kill it. First, we need to figure out its ID, which we'll do using "docker ps":
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2f5ada6523c4 simple_flask:latest /bin/bash 3 days ago Up 34 minutes 0.0.0.0:5000->5000/tcp nostalgic_goodall simple_flask
We can kill our simple_flask app running as container "2f5ada6523c4" like this:
$ docker kill 2f5ada6523c4
2f5ada6523c4
To verify it's gone, lets run "docker ps" again, but this time add the "-a" option so that we see all containers:
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2f5ada6523c4 simple_flask:latest /bin/bash 3 days ago Exited (-1) 2 seconds ago nostalgic_goodall
55ba5b67cb28 simple_flask:latest /bin/bash 3 days ago Exited (0) 40 minutes ago dreamy_mayer
6029929a1e53 ubuntu:latest /bin/bash 3 days ago Exited (0) 13 hours ago naughty_leakey
ef9bd2df07c6 ubuntu:latest /bin/bash 4 days ago Exited (0) 13 hours ago simple_flask
As you'll see, although the container has stopped running (i.e., its status has changed from Up to Exited), the container itself is still there. In fact, unless you use the "-rm" option when you start a container, it will always leave this remnant image behind. And, if left unchecked, after a while you'll consume your entire disk with stopped containers.
Why is it like this, you might ask? The answer lies in Docker's need to get a clean files state for a commit. In a container is running, it can mean that there are open files or processes that could interfere with the ability to save the state of the filesystem. So, rather than destroy the container automatically, Docker saves it to enable you to get a nice, clean commit image. So, killing the image is really the same as just pausing it.
So, to get rid of this ghost container, you need to use "docker rm":
$ docker rm 2f5ada6523c4
2f5ada6523c4
But, what about all those other stopped containers hanging around? Never fear! This great tip from Jim Hoskins in Remove Untagged Images From Docker shows a quick way to remove stopped containers in bulk:
$ docker rm $(docker ps -aq)
The Docker Hub is a site where you store and share images you create. Consciously modeled on GitHub, the site offers features like:
- Docker image hosting. Like GitHub, public images are free and private images are paid
- Collaborators who can push and pull the images
- Webhooks that are fired when a new image is pushed
- Statistics tracking, such as downloads and stars
There are two key ways to interact with the Docker Hub:
- The Web interface, where you can register, manage image metadata (description, etc), add or remove collaborators, and so forth. This is similar to GitHub.
- The docker command line tool, where you can pull, push, or search for images. This is similar to git.
There are a number of features specific Docker. One of the most interesting, trusted builds, allow you to create signed images based on a Dockerfile in GitHub or BitBucket. As with a CI/CD tool, the trusted build is triggered by a post commit hook on your repo so that your image will also be up to date with your codebase. In addition, the image is flagged with a "trusted build" badge so that you can verity its source.
The first to using Docker hub is to sign up, which you can do on their Signup page. You'll be asked for an email address, username, and a password. Once you login, you'll see a project dashboard that's similar to other hosting sites:
Once you've set up your account, you use the docker login
command from your terminal so that you can pull and push images from your account. (Note that you can also register with this command, as well, if you want to not use the site.) Once you login, your docker will store your credentials in a file called .dockercfg
in your home directory.
One of the most basic things you can do with the Hub is search for other images using docker search
. Here, for example, are the first few results returned by the docker search ubuntu
command. As of this writing, we get over 1300 matches, so be prepared to wait a bit for the results.)
$ docker search ubuntu
NAME DESCRIPTION STARS OFFICIAL AUTOMATED
ubuntu Official Ubuntu base image 525 [OK]
stackbrew/ubuntu Official Ubuntu base image 40 [OK]
dockerfile/ubuntu Trusted Ubuntu (http://www.ubuntu.com/) Bu... 19 [OK]
crashsystems/gitlab-docker A trusted, regularly updated build of GitL... 19 [OK]
ubuntu-upstart Upstart is an event-based replacement for ... 10 [OK]
dockerfile/ubuntu-desktop Trusted Ubuntu Desktop (LXDE) (http://lxde... 9 [OK]
lukasz/docker-scala Dockerfile for installing Scala 2.10.3, Ja... 7 [OK]
litaio/ruby Ubuntu 14.04 with Ruby 2.1.2 compiled from... 7 [OK]
mbentley/ubuntu-django-uwsgi-nginx 6 [OK]
cmfatih/phantomjs PhantomJS [ phantomjs 1.9.7, casperjs 1.1.... 5 [OK]
zmarcantel/cassandra Cluster/Standalone DataStax Cassandra on U... 4 [OK]
suttang/gollum The gollum wiki installed on ubuntu 4 [OK]
...
The command returns the following columns:
- Name. This is the name and namespace of the image. (More on namespaces in a bit.)
- Description. This is metadata about the image from Docker Hub.
- Stars. The number of people who have "starred" the image
- Official. Whether or not the image comes from Docker, Inc. An "OK" in this column helps provide some sense of security that the image has been vetted.
- Automated. Whether or not the image is the result of an automated build. While less authoritative than an "Offical" image, an automated image will provide a link to the source GitHub or BitBucket repo on which the image is based. This way you can at least review the contents of the image before you start it on your environment.
As you undoubtedly noticed, there are a couple of formats in the "Name" column. Some images have just a single name, like "ubuntu," and some have two names separated by a slash, like "crashsystems/gitlab-docker." This naming convention is a reflection of Docker's namespace. There are two namespace main levels:
- Root. This is the namespace is reserved for "root" images maintained directly by Docker, Inc. Basically, it's an "empty" space, and the names are things like "ubuntu" or "debian."
- User. This namespace is for users within the Docker Hub, such as "crashsystems/gitlab-docker," "mbentley/ubuntu-django-uwsgi-nginx," or "suttang/gollum"
Pulling any of these images is simple: just use docker pull
and whatever the name is. For example, here's how you pull all layers of the "stackbrew/ubuntu" image:
$ $ docker pull stackbrew/ubuntu
Pulling repository stackbrew/ubuntu
822a01ae9a15: Pulling dependent layers
c4ff7513909d: Download complete
195eb90b5349: Pulling dependent layers
bac448df371d: Download complete
dfaad36d8984: Download complete
5796a7edb16b: Download complete
af82eb377801: Downloading [===========> ] 16.38 MB/68.7 MB 164h59m35s
3af9d794ad07: Downloading [========================> ] 19.07 MB/39.53 MB 55h25m0s
93c381d2c255: Downloading [================> ] 12.58 MB/39.17 MB 109h9m35s
1c9383292a8f: Download complete
...
As discussed in the chapter on images, you can also pull only the layers of the latest build by adding ":latest" to the name. (Otherwise you'll pull the full history.)
Let's try to push an image to the hub. Unlike the previous search
and pull
commands, you must be logged in to push. If you're not already logged in, Docker will prompt you for your credentials
$ docker push simple_flask:latest
2014/09/03 12:42:10 You cannot push a "root" repository. Please rename your repository in <user>/<repo> (ex: odewahn/simple_flask:latest)
This rather self-explanatory error occurs because we didn't specify a username when we created the image. To push to the Docker Hub, though, you must specify a username. (Unless you're luck enough to be making Docker's official images, in which case you probably don't need this book!). So, let's add our username using the docker tag
command:
$ docker tag simple_flask odewahn/simple_flask
Once you've retagged the image, you can push it to the Hub:
$ docker push odewahn/simple_flask:latest
The push refers to a repository [odewahn/simple_flask] (len: 1)
Sending image list
Please login prior to push:
Username: odewahn
Password:
Email: [email protected]
Login Succeeded
The push refers to a repository [odewahn/simple_flask] (len: 1)
Sending image list
Pushing repository odewahn/simple_flask (1 tags)
511136ea3c5a: Image already pushed, skipping
1c9383292a8f: Image already pushed, skipping
9942dd43ff21: Image already pushed, skipping
d92c3c92fa73: Image already pushed, skipping
0ea0d582fd90: Image already pushed, skipping
cc58e55aa5a5: Image already pushed, skipping
c4ff7513909d: Image already pushed, skipping
c21c062b086a: Image successfully pushed
Pushing tag for rev [c21c062b086a] on {https://cdn-registry-1.docker.io/v1/repositories/odewahn/simple_flask/tags/latest}