Docker is a tool that follows the "cattle, not pets" DevOps mantra. You describe your hosting environment in a Dockerfile. Each deployment provisions an entirely new hosting environment while the previous one is decommissioned in the background. Docker achieves this by making it cheap to create new system images and to spawn fleets of new containers from them.
- Quick example recap
- Popular commands
- Key concepts
- Getting started
- The Dockerfile
- Multi-stage builds
- Docker configuration
- Development workflow
- Networking
- Popular base images
- Tips and tricks
- Best practices
- FAQ
  - How to delete all the containers on my local machine?
  - How to delete all the images on my local machine?
  - How to run a container in the background?
  - How to pass variables to the docker build command?
  - How to configure Docker to use other registries?
  - How to read or stream a container's logs?
  - How to set the PATH in the Dockerfile?
  - How to create a global ARG in a multi-stage build?
- Annex
- References
- Create a new NodeJS project:
mkdir my-app && \
cd my-app && \
npm init --yes && \
npm i express && \
touch index.js && \
touch Dockerfile
- Paste the following hello world API in index.js:
const express = require('express')
const app = express()
app.get('/', (req, res) => {
res.send('hello world')
})
app.listen(3000, () => console.log(`Server ready and listening on port ${3000}`))
- Paste the following in the Dockerfile:
# Use the official lightweight Node.js 12 image.
# https://hub.docker.com/_/node
FROM node:12-slim
# Create and change to the app directory.
WORKDIR /usr/src/app
# Copy application dependency manifests to the container image.
# A wildcard is used to ensure both package.json AND package-lock.json are copied.
# Copying this separately prevents re-running npm install on every code change.
COPY package*.json ./
# Install production dependencies.
RUN npm install --only=prod
# Copy local code to the container image.
COPY . ./
# Run the web service on container startup.
CMD ["node", "index.js"]
- Build an image for this project:
docker build -t nodejs:v0 .
Where:
- -t is the tag option, which names (i.e., tags) the image. <IMAGE NAME>:<VERSION TAG> is the usual image naming convention, but you can change it to whatever you prefer.
- . is the current directory, which is used as the build context for the image.
- Launch a new container:
docker run -p 127.0.0.1:4000:3000 nodejs:v0
This last command port-forwards the traffic received on 127.0.0.1:4000 to port 3000 inside the container.
Command | Description |
---|---|
docker image ls | List all images. |
docker ps -a | List all containers. The -a option includes the non-running containers. |
docker build -t my_image:v1 . | Create an image from a Dockerfile. |
A running instance of an image is called a container. An image is made of a set of layers. If you start an image, you have a running container for that image. You can have many running containers of the same image.
You can see all your images with docker images, whereas you can see all your containers with docker ps -a (docker ps on its own only lists the running ones).
A typical beginner's head-scratcher is how to change a container's config after it has been created. The answer is: you can't. For example, suppose a container with an app listening on port 3000 has been provisioned as follows:
docker run my_image:v1
The app in this container is listening on port 3000. It cannot receive traffic from outside because no port binding has been configured. It is not possible to change that container later. Instead, create a new container from that image as follows, and delete the previous container:
docker run -p 127.0.0.1:3000:3000 my_image:v1
This means: port-forward traffic received on 127.0.0.1:3000 to this container on its internal port 3000.
This approach highlights the typical way of thinking with Docker. Because containers and images are cheap to create, you do not reconfigure them. Instead, you recreate them from scratch. That's the DevOps "cattle, not pets" mantra.
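The recreate-instead-of-reconfigure habit can be wrapped in a small shell helper. This is only a sketch: the function name recreate_container and its argument order are hypothetical, and it assumes the container was given an explicit name.

```shell
# Replace a container instead of reconfiguring it.
# Usage: recreate_container <name> <image> <host_port> <container_port>
recreate_container() {
  name=$1 image=$2 host_port=$3 container_port=$4
  # Remove the old container if it exists; ignore the error when it does not.
  docker rm -f "$name" >/dev/null 2>&1 || true
  # Start a fresh, detached container from the same image with the new port binding.
  docker run -d --name "$name" -p "127.0.0.1:${host_port}:${container_port}" "$image"
}
```

For example, recreate_container web my_image:v1 3000 3000 throws away any existing container named web and starts a new one that receives traffic on 127.0.0.1:3000.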
You define those 2 properties in the Dockerfile. For example:
ENTRYPOINT ["echo", "Hello"]
CMD ["World"]
With those 2 set up, starting a container with docker run <YOUR-IMAGE> will execute the default command echo Hello World. As you can see, the default command is just the concatenation of the entrypoint and the cmd.
Though both can also be written as plain strings (shell form), Docker eventually converts them to arrays by wrapping the string in /bin/sh -c, so it's usually less confusing to always use arrays.
To learn more about this topic, please refer to the ENTRYPOINT and CMD section.
- Install Docker.
- Make sure it runs, for example by launching the Docker app on your desktop. If it is not running (i.e., the daemon is not started in the background), the docker command will fail.
- Create a Dockerfile.
- Build the image:
docker build -t <IMAGE NAME>:<VERSION TAG> .
Where:
- -t is the tag option, which names (i.e., tags) the image. <IMAGE NAME>:<VERSION TAG> is the usual image naming convention, but you can change it to whatever you prefer.
- . is the current directory, which is used as the build context for the image.
- List your containers:
docker ps -a
- Launch your container:
docker start -a <MY-CONTAINER-NAME>
This section explains the important differences between:
- Creating an image with docker build
- Creating a container from an image with docker run
- Starting an existing container with docker start
- Executing commands in a running container with docker exec
If you're still confused by the conceptual difference between an image and a container, please refer to the Image vs Container section.
Typical usage:
docker build -t my_image:v1 .
Where:
- -t is the tag option, which names (i.e., tags) the image. my_image:v1 follows the usual <IMAGE NAME>:<VERSION TAG> naming convention, but you can change it to whatever you prefer.
- . is the current directory, which is used as the build context for the image.
- If your project does not contain a Dockerfile and a .dockerignore, create them. Typically, the Dockerfile imports the project files you need into the image and defines a command that can start your project, or an entry point that allows calling your project's APIs.
- Create the image for your project with docker build -t <IMAGE NAME>:<VERSION TAG> . (e.g., docker build -t my-website:v1 .).
- Once that's done, you can see your image with docker images.
- To delete your image, run docker rmi <IMAGE ID>.
Typical usage:
docker run -it my_image:v1
Where -it allows you to interact with the container via the terminal: -i keeps the container's STDIN open and -t allocates a terminal.
Once you have an image on your local machine, you can start a container with docker run <IMAGE_ID|IMAGE_NAME:IMAGE_TAG>. To list the available images and their IDs, use docker images.
Each time you use docker run <IMAGE ID>, a new container is created and started. To test this, run the command multiple times and then execute docker ps -a. You should see multiple containers for that specific image ID.
To delete the containers you don't need, run docker rm <CONTAINER ID>.
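Because every docker run leaves a new container behind, it can help to list and clean up the containers that belong to one image. The two helpers below are a sketch with hypothetical names (containers_of, remove_containers_of); they rely on the ancestor filter of docker ps.

```shell
# List every container (running or not) created from a given image.
# Usage: containers_of <image>
containers_of() {
  docker ps -a --filter "ancestor=$1" --format '{{.ID}} {{.Status}}'
}

# Remove all the containers created from a given image; does nothing if there are none.
# Usage: remove_containers_of <image>
remove_containers_of() {
  ids=$(docker ps -a -q --filter "ancestor=$1")
  if [ -n "$ids" ]; then
    # Word splitting is intentional here: one container ID per argument.
    docker rm $ids
  fi
}
```

Note that docker rm refuses to remove a container that is still running, so stop those containers first.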
Tips:
- Use docker run -it <IMAGE ID> if you need to interact with that container directly via the terminal (-i means STDIN and -t means terminal).
Use docker start if you simply want to start an existing container instead of creating a new one from the image.
- List all the containers:
docker ps -a
- You can start any container using either its CONTAINER ID or its NAMES:
docker start tender_bassi
This command executes a command inside a running container. Its most common use is to open a shell in a container to start interacting with it:
docker exec -it <CONTAINER ID> sh
# Use the official lightweight Node.js 12 image.
# https://hub.docker.com/_/node
FROM node:12-slim
# Create and change to the app directory.
WORKDIR /usr/src/app
# Copy application dependency manifests to the container image.
# A wildcard is used to ensure both package.json AND package-lock.json are copied.
# Copying this separately prevents re-running npm install on every code change.
COPY package*.json ./
# Install production dependencies.
RUN npm install --only=prod
# Configure Nuxt with the host, otherwise, it won't be reachable from outside Docker
ENV NUXT_HOST=0.0.0.0
# Copy local code to the container image.
COPY . ./
# Run the web service on container startup.
CMD npm start
IMPORTANT: The order of the instructions affects the speed at which Docker can create/recreate the image. More details about this topic in the "The instructions order in your Dockerfile matters for performance" section.
Example:
# Comments start with the hash symbol
# Always start with FROM. This specifies the base image
FROM ubuntu:14.04
# MAINTAINER is not required (and is deprecated in favor of LABEL), but naming the author is a good practice
MAINTAINER Nicolas Dao
# Creates a new environment variable called myName
ENV myName="John Doe"
# Add all the files and folders from your docker project into your image under /app/src
ADD . /app/src
# Add all the files and folders from your docker project into your image under /app/src2. To
# understand the difference between COPY and ADD jump to the next ADD vs COPY section
COPY . /app/src2
# Create a new data volume in your image.
VOLUME /new-data-volume
# By default, RUN uses /bin/sh.
RUN apt-get update && apt-get install -y ruby ruby-dev
# WORKDIR sets up the working directory for RUN, CMD, ENTRYPOINT, COPY, and ADD. A relative path is
# resolved against the previous working directory (unless you specify an absolute path).
# If you use a directory which does not exist, that directory will be automatically created.
# Comments must sit on their own line: Docker does not support trailing comments after an instruction.
# Create a new 'a' directory at the container's root, and set it up as the working dir.
WORKDIR /a
# Prints /a
RUN pwd
# Create a new 'b' directory under 'a', and set it up as the working dir.
WORKDIR b
# Prints /a/b
RUN pwd
# ONBUILD is typically used in images intended to be used as base images.
ONBUILD ADD . /app/src
# CMD is the command that will be run after your container has started.
CMD ["echo", "Hello world"]
The following Dockerfile defines a MSG argument and a HELLO environment variable:
FROM amazon/aws-lambda-nodejs:12
ARG MSG
ENV HELLO Hello ${MSG}
HELLO is an environment variable accessible to all the processes running inside the containers created from this image. It is set to the text Hello ${MSG}, where MSG is an argument that can be set via the docker build command:
docker build --build-arg MSG="Mike Davis" -t my-app .
If you have multiple arguments:
docker build --build-arg MSG="Mike Davis" --build-arg AGE=40 -t my-app .
If you need to set a default value for the ARG:
FROM amazon/aws-lambda-nodejs:12
ARG MSG=Baby
ENV HELLO Hello ${MSG}
ARG can also be nested:
FROM amazon/aws-lambda-nodejs:12
ARG MSG=Baby
ARG COOLMSG="you hot ${MSG}"
RUN echo ${COOLMSG}
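To check what ended up in an image, you can print an environment variable from a throwaway container. The helper below is a sketch: the name image_env is hypothetical, and it assumes the image ships the printenv utility.

```shell
# Print the value of an environment variable baked into an image.
# Usage: image_env <image> <variable>
image_env() {
  # --rm deletes the throwaway container once it exits.
  # --entrypoint bypasses the image's own entrypoint so that printenv runs directly.
  docker run --rm --entrypoint printenv "$1" "$2"
}
```

With the first Dockerfile of this section built as my-app, image_env my-app HELLO prints the value assembled from the MSG build argument. ARG values themselves are only available at build time, so image_env my-app MSG prints nothing.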
Both expose the same API: ADD|COPY <src> <dest. in the image>. The difference is that ADD supports more use cases, whereas COPY only supports sources accessible from your docker project.
ADD supports the following additional use cases:
- src can be a URL.
- src can be a local tar file. If the compression format is recognized, the tar file is automatically unpacked.
The best practice is to use COPY when possible. This is more transparent and obvious for anybody reading the Dockerfile.
RUN allows you to customize your base image with additional configurations or artifacts. Each time RUN is executed, a new docker commit is performed. When docker is done with the build, you'll be able to see that there is a commit for each RUN command.
RUN can be used in 2 different ways:
- Shell form: RUN <shell command>
- Exec form: RUN ["executable", "param1", "param2", ... "paramN"], which results in the following command: executable param1 param2 ... paramN
In reality, the shell form is just a shortcut for the following exec form: RUN ["/bin/sh", "-c", "shell command"].
WARNING: The exec form does not invoke a command shell; the executable is run directly with the listed arguments. That means that the exec form does not support shell variable substitution out of the box. If you need variable substitution while using the exec form, you'll have to explicitly invoke the shell and pass it the whole command as a single string:
# Output: $HOME
RUN ["echo", "$HOME"]
# Output: the value of HOME inside the image
RUN ["/bin/sh", "-c", "echo $HOME"]
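The difference can be reproduced outside Docker with plain shell commands. The three functions below are only analogues (their names are made up for this sketch): the first passes the text untouched like the exec form, the second lets /bin/sh expand it like the shell form, and the third shows what happens when the variable is passed as a separate argument after -c.

```shell
# Exec-form analogue: the argument reaches the program untouched, so nothing is expanded.
exec_form_analogue() {
  printf '%s\n' '$HOME'
}

# Shell-form analogue: /bin/sh parses the command string and expands $HOME.
shell_form_analogue() {
  /bin/sh -c 'echo $HOME'
}

# Pitfall: with -c, only the first argument is the script. The next one becomes $0,
# so echo receives no argument and prints an empty line.
split_args_analogue() {
  /bin/sh -c 'echo' '$HOME'
}
```

This is why the whole command must be a single string when you wrap it in /bin/sh -c.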
RUN uses a cache which is not automatically invalidated between builds. To explicitly bypass that cache, use the following command:
docker build --no-cache ...
Each RUN creates a new layer. The best practice is to keep layers to a minimum. Please refer to the "Careful with single-line RUN commands" section to learn more.
You define those 2 properties in the Dockerfile. For example:
ENTRYPOINT ["echo", "Hello"]
CMD ["World"]
With those 2 set up, starting a container with docker run <YOUR-IMAGE> will execute the default command echo Hello World. As you can see, the default command is just the concatenation of the entrypoint and the cmd.
Though both can also be written as plain strings (shell form), Docker eventually converts them to arrays by wrapping the string in /bin/sh -c, so it's usually less confusing to always use arrays.
ENTRYPOINT also works with scripts:
COPY ./docker-entrypoint.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]
- The main difference between CMD and RUN is that CMD does not result in a docker commit. CMD’s purpose is not to modify the base image, but instead to execute a command as soon as the container is up.
- There can only be one CMD per Dockerfile. If there is more than one, only the last one takes effect.
To override CMD:
docker run <IMAGE> Baby
This will print: Hello Baby
To override ENTRYPOINT:
docker run -it --entrypoint /bin/bash <IMAGE>
This allows you to interact with the shell from inside the container.
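The way the final command is assembled can be modelled in a few lines of shell. This is a simplified sketch, not Docker's implementation: effective_command is a hypothetical name, and the entrypoint and CMD are passed as plain strings instead of arrays.

```shell
# Model how the container command is assembled from ENTRYPOINT, CMD and runtime arguments.
# Usage: effective_command <entrypoint> <default cmd> [runtime args...]
effective_command() {
  entrypoint=$1
  default_cmd=$2
  shift 2
  if [ "$#" -gt 0 ]; then
    # Arguments given after the image name replace CMD.
    echo "$entrypoint $*"
  else
    # Without runtime arguments, CMD supplies the default arguments.
    echo "$entrypoint $default_cmd"
  fi
}
```

With the example above, effective_command "echo Hello" "World" yields echo Hello World, while effective_command "echo Hello" "World" Baby yields echo Hello Baby.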
This directive registers another directive to be executed later, when the image is used as the base of another image build. This is useful when building images based on other images. For example, let's imagine the following parent image called parentimage:0.0.1:
FROM ubuntu:14.04
...
ONBUILD ADD . /app/src
...
The following child image does not need to worry about copying its content to /app/src:
FROM parentimage:0.0.1
...
If the parent had been defined as follows:
FROM ubuntu:14.04
...
ADD . /app/src
...
This would have meant that the content of the parent project, rather than the child's, would have been added to /app/src.
Original article: https://docs.docker.com/develop/develop-images/multistage-build/
Complex images require more layers to be built. Those additional layers increase the size of the final image, which impacts performance. To keep the final image as slim as possible, the first popular pattern, called the builder pattern, used a shell script that would run two Dockerfiles sequentially. The first one, called Dockerfile.build, would build all the artefacts, similarly to a build server. The image created by this Dockerfile.build would not matter. The artefacts would then be passed to the production Dockerfile, which would therefore be lean.
Later, Docker shipped a new feature called multi-stage build.
Let's imagine we want to package some NodeJS code to inject those files in another image. We don't really care about NodeJS or NPM, we only want the artefact. The following Dockerfile uses a minimal node image to build our project:
FROM node:14-slim
ARG FUNCTION_DIR="/opt/nodejs/"
RUN mkdir -p $FUNCTION_DIR
WORKDIR $FUNCTION_DIR
COPY package*.json ./
RUN npm i --only=prod
ENTRYPOINT ["/bin/sh"]
However, this image is still 168MB. Because we only care about the actual files, we can simply build the smallest image possible and pass it the built artefacts using the multi-stage build API:
FROM node:14-slim AS builder
ARG FUNCTION_DIR="/opt/nodejs/"
RUN mkdir -p $FUNCTION_DIR
WORKDIR $FUNCTION_DIR
COPY package*.json ./
RUN npm i --only=prod
FROM busybox
ARG FUNCTION_DIR="/opt/nodejs/"
RUN mkdir -p $FUNCTION_DIR
WORKDIR $FUNCTION_DIR
COPY --from=builder $FUNCTION_DIR ./
ENTRYPOINT ["/bin/sh"]
The image's size is now 1.8MB.
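When working with a multi-stage Dockerfile, it can be handy to build a single stage on its own (for instance, to inspect the builder) and to compare image sizes. The helpers below are a sketch with hypothetical names; they rely on the --target option of docker build and on the --format option of docker image ls.

```shell
# Build a single named stage of a multi-stage Dockerfile and tag the result.
# Usage: build_stage <stage> <tag> [context]
build_stage() {
  # The build context defaults to the current directory.
  docker build --target "$1" -t "$2" "${3:-.}"
}

# Print "<repository>:<tag> <size>" for an image reference.
# Usage: image_size <image>
image_size() {
  docker image ls --format '{{.Repository}}:{{.Tag}} {{.Size}}' "$1"
}
```

For example, build_stage builder my-app:builder followed by image_size my-app lists the builder image next to any other my-app tags, which makes the size difference between stages visible.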
The Docker configuration is maintained in the ~/.docker/config.json file.
By default, deploying new images targets Docker Hub, but you may want to deploy your images using other services (e.g., Google Cloud Container Registry or AWS Elastic Container Registry (ECR)).
To add new registries, edit the ~/.docker/config.json file by adding a new credHelpers property as follows:
{
"credHelpers": {
"gcr.io": "gcloud",
"us.gcr.io": "gcloud",
"eu.gcr.io": "gcloud",
"asia.gcr.io": "gcloud",
"staging-k8s.gcr.io": "gcloud",
"marketplace.gcr.io": "gcloud"
}
}
Where each key represents a domain (e.g., gcr.io) and each value represents a program (e.g., gcloud). The above example shows the exhaustive config for setting up Google Cloud Container Registry for the GCloud CLI.
As a matter of fact, designing the right container is about designing the right image. Docker excels at creating new images and spawning containers from them in milliseconds. This efficiency comes from its layered design. The original layers take time to be pulled from the registry (e.g., the first time you pull a Linux distro preconfigured with NodeJS). But once those layers have been pulled, they are cached on your local system. As you're building your own layers locally, those layers are also cached, and that's why iterating through building the image you need is so cheap. To test the image, you spawn a container from it. If the container needs to be fixed, you fix the underlying image definition (usually via its Dockerfile), generate a new image, and spawn a new container. These are the basics behind the iteration process. You eventually end up with a lot of unused images and containers. Simply delete them once you're done.
The iteration process is similar to this:
- Create a Dockerfile (and a new .dockerignore) based on what you think you need.
- Create a new image from that Dockerfile as follows: docker build -t <YOUR-IMAGE-NAME>:<YOUR-TAG> .
- Create a new container from that new image:
docker run -it <YOUR-IMAGE-NAME>:<YOUR-TAG>
- Start interacting with your running container:
docker exec -it <CONTAINER ID> sh
- Iterate through that process until you have the container you want.
To print the container's logs:
docker logs <CONTAINER ID OR NAME>
The issue with this command is that it's a one-off. If you need to stream the logs continuously, use this instead:
docker logs --follow <CONTAINER ID OR NAME>
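On a chatty container, following the logs from the very beginning can be overwhelming. A small wrapper (follow_logs is a hypothetical name, and the default of 100 lines is an arbitrary choice) can limit the history and add timestamps using the --tail and --timestamps options of docker logs:

```shell
# Stream the most recent log lines of a container, prefixed with timestamps.
# Usage: follow_logs <container id or name> [number of past lines]
follow_logs() {
  # --tail limits how much history is printed before streaming starts (default: 100 lines).
  docker logs --follow --timestamps --tail "${2:-100}" "$1"
}
```

For example, follow_logs tender_bassi 20 prints the last 20 lines and then keeps streaming until you press Ctrl+C.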
If you package an app in your container that listens on port 3000, and if you intend to expose that app outside of the container, then the following won't work:
docker run my_image:v1
Instead, you need to configure your container with port binding as follows:
docker run -p 127.0.0.1:3100:3000 my_image:v1
The above creates a container configured on the host to do port forwarding, so that all traffic sent to the host on 127.0.0.1:3100 is forwarded to the app listening on port 3000 inside the container.
NOTE: -p stands for publish.
Image | Description | Link |
---|---|---|
scratch | The most minimal image, useful to build other images. It's pretty much the base image for all the others. | https://hub.docker.com/_/scratch |
busybox | Minimal image with many UNIX tools preinstalled. | https://hub.docker.com/_/busybox |
alpine:<VERSION> | Built on top of busybox, it is a very minimal Linux distribution. | https://hub.docker.com/_/alpine |
node:<version> | Self-explanatory. | https://hub.docker.com/_/node |
node:<version>-alpine | Same as node:<version> but with alpine as the base image to save on space. Might not work in all scenarios. | https://hub.docker.com/_/node |
The following example shows how to use a LAYER_01 argument to load a different layer image for an AWS Lambda image.
ARG LAYER_01
FROM $LAYER_01 AS layer01
FROM amazon/aws-lambda-nodejs:12
ARG FUNCTION_DIR="/var/task"
ARG LAYER01_PATH="/opt/layer01/"
# Copy layer01 files into the opt/ folder
COPY --from=layer01 /opt/nodejs/ $LAYER01_PATH
ENV NODE_PATH "${LAYER01_PATH}node_modules:${NODE_PATH}"
# Create function directory
RUN mkdir -p ${FUNCTION_DIR}
# Copy handler function and package.json
COPY index.js ${FUNCTION_DIR}
# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "index.handler" ]
BAD EXAMPLE:
FROM node:12-slim
WORKDIR /usr/src/app
COPY . ./
RUN npm install --only=prod
CMD npm start
GOOD EXAMPLE:
FROM node:12-slim
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install --only=prod
COPY . ./
CMD npm start
All instructions in your Dockerfile create a new layer(1). Because layers are cached, and because a change to a layer forces the regeneration of all the layers that follow it, it is recommended to start the Dockerfile with the instructions that rarely change. In the GOOD EXAMPLE, the first 4 lines should rarely change(2), while the 5th line (COPY . ./) almost always changes. Indeed, each time a file changes, Docker considers the COPY . ./ layer as different, which forces the regeneration of all the following instructions. In the BAD EXAMPLE, this includes RUN npm install --only=prod. This is a waste, as the dependencies of a NodeJS project rarely change (for the readers unfamiliar with NodeJS, those dependencies are explicitly defined in the package.json and package-lock.json files). A better approach is the one shown in the GOOD EXAMPLE, where the dependency manifests are copied and installed before the rest of the code.
- (1) Only RUN, COPY and ADD generate layers that impact the image size. The other instructions create intermediate layers.
- (2) Docker uses the following criteria to determine whether to re-use a cached layer or not:
  - COPY and ADD: the layer is regenerated when the command has changed, or when the content of the copied files has changed. Docker computes a hash of each file and uses that hash to determine if a file has changed.
  - RUN: only the command is checked. If the command has changed, the layer is regenerated. IMPORTANT: This means that if you must regenerate a RUN layer whose command has not changed, you must make sure that it is preceded by an ADD or COPY layer that has changed too (or build with --no-cache).
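The hash-based check for COPY and ADD can be illustrated with standard tools. The function below is only an illustration of the idea, not Docker's actual algorithm: context_fingerprint is a hypothetical name, and it uses the POSIX cksum utility where Docker uses its own checksums.

```shell
# Compute one fingerprint for all the files under a directory.
# Usage: context_fingerprint <directory>
context_fingerprint() {
  # cksum prints "<checksum> <size> <path>" for each file.
  # Sorting makes the result independent of the order in which find lists the files.
  find "$1" -type f -exec cksum {} + | sort | cksum
}
```

If the fingerprint of the copied files is the same as during the previous build, the cached layer can be reused. As soon as one file changes, the fingerprint changes, and that layer and every layer after it must be rebuilt.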
The build context should be small, where small means both bytes and number of files. All the files in your project are sent to the Docker daemon at build time. If there are too many files, or if they are too big, you'll experience bad performance. If you have a lot of files and some of them are not part of the build process, use a .dockerignore file to exclude them.
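To get a rough idea of how heavy a build context is before building, you can count its files and measure its size. This helper is a sketch (context_stats is a hypothetical name) and it does not take the .dockerignore exclusions into account.

```shell
# Print the number of files and the total size (in KB) of a directory.
# Usage: context_stats <directory>
context_stats() {
  # wc may pad its output with spaces on some systems, hence the tr.
  files=$(find "$1" -type f | wc -l | tr -d ' ')
  # du -sk prints "<kilobytes><TAB><path>"; keep the first field only.
  kilobytes=$(du -sk "$1" | cut -f1)
  echo "files=$files size_kb=$kilobytes"
}
```

Running context_stats . at the root of a NodeJS project before and after deleting node_modules gives a feel for how much a .dockerignore entry can save.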
Never write RUN apt-get update on its own on a single line. Otherwise, Docker caches the resulting layer, and later builds keep reusing that stale package index instead of refreshing it. Instead, use the following:
RUN apt-get update && apt-get install -y s3cmd=1.1.0.*
This is better because:
- Adding or removing a dependency changes the command, which automatically invalidates the cache.
- The version of s3cmd is now explicit, so updating it forces Docker to refresh the cache.
It's also considered a best practice to organize your dependencies using multi-lines and alphabetical order:
RUN apt-get update && apt-get install -y \
aufs-tools \
automake \
btrfs-tools \
build-essential \
curl \
s3cmd=1.1.0.*
Replace this:
ADD http://example.com/big.tar.xz /usr/src/things/
RUN tar -xJf /usr/src/things/big.tar.xz -C /usr/src/things
RUN make -C /usr/src/things all
With this:
RUN mkdir -p /usr/src/things \
&& curl -SL http://example.com/big.tar.xz \
| tar -xJC /usr/src/things \
&& make -C /usr/src/things all
The second example is better than the first because the first creates a layer just for the ADD and keeps the downloaded archive in the image for no reason.
Multi-stage builds allow you to load all the tools and run all the expensive steps in intermediate images, and then load the final artefacts into a minimal image (small base image and minimal number of layers). To learn more, please refer to the Multi-stage builds section.
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
docker rmi $(docker images -a -q) --force
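The commands above print an error when there is nothing to delete, because docker stop, docker rm and docker rmi are then called without any argument. A guarded variant could look like the following sketch (the function names are hypothetical):

```shell
# Remove every container, running or not; does nothing when there are none.
remove_all_containers() {
  ids=$(docker ps -a -q)
  if [ -z "$ids" ]; then
    echo "no containers to remove"
    return 0
  fi
  # -f stops the running containers before removing them.
  # Word splitting is intentional: one container ID per argument.
  docker rm -f $ids
}

# Remove every container, then every image; does nothing when there are none.
remove_all_images() {
  remove_all_containers
  ids=$(docker images -a -q)
  if [ -z "$ids" ]; then
    echo "no images to remove"
    return 0
  fi
  docker rmi -f $ids
}
```

Both functions are destructive, so only use them on a development machine.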
This is called detached mode:
docker run -d my_image:v1
- Use the --build-arg option as follows:
docker build --build-arg HTTP_PROXY=http://10.20.30.2:1234 --build-arg FTP_PROXY=http://40.50.60.5:4567 .
- In your Dockerfile, just after the FROM, add the following instructions:
ARG HTTP_PROXY
ARG FTP_PROXY
Please refer to the Adding other Docker registries section.
Please refer to the Logs section.
ENV PATH="/opt/gtk/bin:${PATH}"
ARG instructions are scoped per build stage. This means that each build stage MUST explicitly declare the ARG it uses. For example, the following is incorrect:
ARG PULUMI_BIN="/opt/.pulumi/bin"
FROM alpine:3.14 AS builder
RUN mkdir -p $PULUMI_BIN
FROM busybox
RUN mkdir -p $PULUMI_BIN
The correct version is:
ARG PULUMI_BIN="/opt/.pulumi/bin"
FROM alpine:3.14 AS builder
ARG PULUMI_BIN
RUN mkdir -p $PULUMI_BIN
FROM busybox
ARG PULUMI_BIN
RUN mkdir -p $PULUMI_BIN
NOTE: Both stages will use the same value.
Dockerfile
# Use the official lightweight Node.js 12 image.
# https://hub.docker.com/_/node
FROM node:12-slim
# Create and change to the app directory.
WORKDIR /usr/src/app
# Copy application dependency manifests to the container image.
# A wildcard is used to ensure both package.json AND package-lock.json are copied.
# Copying this separately prevents re-running npm install on every code change.
COPY package*.json ./
# Install production dependencies.
RUN npm install --only=prod
# Copy local code to the container image.
COPY . ./
# Run the web service on container startup.
CMD npm start
.dockerignore
Dockerfile
README.md
node_modules
npm-debug.log
The following config in your terminal config allows you to use:
- d instead of docker.
- drm to delete all the containers.
- drmi to delete all the images.
- dhost 4000:8080 to build an image, launch a container, and redirect traffic on 127.0.0.1:4000 to port 8080 in your container.
- dit to build an image, launch a container, and start a terminal in it.
- dit --entrypoint /bin/bash to force starting bash.
- dlocal 3000:3100 [other options] instead of docker run -p 127.0.0.1:3000:3100 [other options].
alias d="docker"
function drm() {
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
}
function drmi() {
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
docker rmi $(docker images -a -q) --force
}
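function dlocal() {
# $1 is the <host port>:<container port> pair; all the remaining arguments are passed through as-is.
local ports="$1"
shift
docker run -p "127.0.0.1:$ports" "$@"
}
function dhost() {
docker build -t localapp .
docker run -p "127.0.0.1:$1" localapp:latest
}
function dit() {
docker build -t localapp .
# All the arguments (e.g., --entrypoint /bin/bash) are passed to docker run.
docker run -it "$@" localapp:latest
}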