Every time a build starts on a new agent, the docker build cache is empty, so docker build has to rebuild the entire image from scratch. To avoid this, we can repopulate the docker cache when starting a build.
If your build is not using multi-stage builds yet, repopulating the cache is simple: pull the latest image from the registry before building, and tell docker build to use it as a cache source with --cache-from:
docker pull service:dev || true   # || true keeps the build going on the very first run, when the image doesn't exist yet
docker build --cache-from=service:dev -t service:dev .
However, using a multi-stage build is worth it to:
- reduce the final image size
- reduce build time by building independent stages in parallel
Take the following Dockerfile using a multi-stage build as an example:
# compile stage: uses a full Elixir image to build the release
FROM hexpm/elixir:1.10.4-erlang-23.0.2-alpine-3.11.6 AS compile
WORKDIR /build
COPY something something2 /build/
RUN build.sh --target /app

# runtime stage: a minimal image that only carries the compiled artifacts
FROM alpine:3.11.6 AS runtime
COPY --from=compile /app /app
ENTRYPOINT ["/app/run"]
(note: this is not a real Dockerfile)
It creates a compile stage using a base Elixir image, copies the code in, and compiles it, saving the compiled artifacts to /app. Then it creates a new (and final) stage called runtime, which starts from a minimal Alpine image and copies the compiled artifacts from the compile stage.
What happens if you try to rebuild the cache the same way as before? Say you build and push the image:
docker build -t service:dev .
docker push service:dev
And then try to repopulate the cache:
docker pull service:dev || true
docker build --cache-from=service:dev -t service:dev .
The build process will start from scratch. This happens because the final tagged image is the runtime stage, and that stage doesn't carry the layers of the compile stage, which is most likely where the heavy lifting and the longest build steps happen.
In order to cache it, you need to split your build and tag the intermediate stage as well:
docker pull service:compile || true
docker pull service:dev || true

# Build compile stage using the pulled image as cache:
docker build \
  --target compile \
  --cache-from=service:compile \
  --tag service:compile .

# Build runtime stage:
docker build \
  --target runtime \
  --cache-from=service:compile \
  --cache-from=service:dev \
  --tag service:dev .
With that, the cache will be populated from the pulled images and reused in subsequent builds.
NOTE: remember to push the tagged images to your registry ;)
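For example, at the end of a successful build:
docker push service:compile
docker push service:dev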
Another improvement is to use the newer docker build backend: BuildKit (see the GitHub repo linked at the end).
BuildKit has been integrated with docker build since version 18.06 (2018-07-18), but it is still in development and not enabled by default.
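If you prefer to enable it for every build on the host, instead of exporting DOCKER_BUILDKIT=1 per shell (as shown further below), it can be turned on in the daemon configuration. A minimal sketch, assuming the default config location /etc/docker/daemon.json:
{
  "features": { "buildkit": true }
}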
It is useful when you have a Dockerfile like this:
FROM something AS base
WORKDIR /build
COPY something /build
RUN get-dependencies

# these two stages only depend on base, so they can be built in parallel
FROM base AS compile-backend
RUN compile-backend.sh

FROM base AS compile-frontend
RUN compile-frontend.sh

# cache-only stage gathering the outputs of both compile stages
FROM base AS compiled
COPY --from=compile-backend /app /app
COPY --from=compile-frontend /assets /app/assets

# final minimal image
FROM minimal-something AS runtime
COPY --from=compile-backend /app /app
COPY --from=compile-frontend /assets /app/assets
ENTRYPOINT ["/app/run"]
In this example Dockerfile, compile-backend and compile-frontend share the base stage as a dependency but can be built in parallel; compiled and runtime then wait for both, but can also be built in parallel with each other. The compiled stage exists solely for caching purposes, and it is the one that should be tagged and pushed to the registry.
To be able to build this in parallel with BuildKit, you need to enable it:
export DOCKER_BUILDKIT=1

docker build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --target compiled \
  --cache-from=service:compiled \
  --tag service:compiled .
BUILDKIT_INLINE_CACHE should be set so the pushed image embeds the cache metadata for its layers; otherwise, the cache won't be repopulated when the image is pulled.
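Putting it together, the flow on a fresh agent might look like this (a sketch reusing the service:compiled tag from above; note that BuildKit resolves --cache-from images from the registry itself, so no explicit docker pull is needed):
# on the previous agent, after a successful build:
docker push service:compiled

# on the new, cache-less agent:
export DOCKER_BUILDKIT=1
docker build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --target compiled \
  --cache-from=service:compiled \
  --tag service:compiled .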
Here's what a BuildKit parallel build looks like:
[screenshot: BuildKit building the deps-dev, deps-test, deps-prod and dialyzer stages in parallel]
There is a new experimental feature, docker buildx, that will better integrate the docker build process with BuildKit. If you'd like to give it a try and report back, here it is: https://docs.docker.com/buildx/working-with-buildx/ :)
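For reference, a rough sketch of what the same cached build could look like with buildx, assuming the service:compiled tag from above (here --cache-to=type=inline plays the role of the BUILDKIT_INLINE_CACHE build arg, and --push uploads the result in the same step):
docker buildx build \
  --target compiled \
  --cache-from=type=registry,ref=service:compiled \
  --cache-to=type=inline \
  --tag service:compiled \
  --push .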
- https://github.com/moby/buildkit
- https://pythonspeed.com/articles/faster-multi-stage-builds/
- https://kgrz.io/caching-parallelism-in-docker-multi-stage-builds.html
- https://github.com/team-telnyx/infra-ci-pipelines-steps/pull/29
- https://docs.docker.com/buildx/working-with-buildx/
- https://github.com/moby/moby/issues/34715