Right now, the builder is tightly coupled with Docker core. The builder has the following roles:
- process the build context
- parse the Dockerfile into s-expressions
- evaluate individual s-expressions, which may involve the following jobs:
  - persist configuration, e.g. ENV, EXPOSE, VOLUME, etc.
  - import external data into a new layer, e.g. ADD, COPY, etc.
  - run extra commands that create a new layer, e.g. RUN, etc.

This design has problems:
- the current builder is not extensible. The only interface to the builder right now is the Dockerfile, which is in many cases clunky, hard to extend, and hard to deprecate.
- it is hard to guarantee backward compatibility between different versions of the Dockerfile format and/or to deprecate old Dockerfile instructions. Ideally, a Dockerfile should always build even with a new version of Docker and/or the builder.
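To make the parsing role above concrete, here is a rough Python sketch of turning Dockerfile text into the s-expression-like (INSTRUCTION, arguments) pairs the builder evaluates one at a time. This is illustrative only, not the real builder's parser; it ignores JSON-form arguments, comments inside continuations, and other Dockerfile subtleties.

```python
def parse_dockerfile(text):
    """Parse Dockerfile text into a list of (INSTRUCTION, arguments) pairs."""
    instructions = []
    continued = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith("\\"):
            # accumulate line continuations into one logical instruction
            continued += line[:-1].rstrip() + " "
            continue
        line = continued + line
        continued = ""
        name, _, args = line.partition(" ")
        instructions.append((name.upper(), args.strip()))
    return instructions
```

Each resulting pair is then evaluated independently, which is exactly the seam where a cleaner builder/core split could be introduced.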
The semantics of processing the build context as well as parsing the Dockerfile should be separated from Docker core. Docker core should only be concerned with processing the build layers (ideally via a set of defined APIs). For example, a set of Dockerfile instructions could be translated into API calls as follows (all the calls are just examples and may not be correct at all):
```
--> POST /build
47c0 # build session id

FROM golang:1.3.1
--> POST /images/create?fromImage=golang:1.3.1
asdf
--> POST /build/47c0/create?fromImage=asdf
31f1 # container id
--> POST /build/47c0/commit/31f1
31f0 # layer id

ADD . /go/src/app
--> POST /build/47c0/add?fromImage=31f0&path=/go/src/app
<tar stream>
f50c # container id
--> POST /build/47c0/commit/f50c
f41c # layer id

BUILD /go/bin # nested build
--> POST /containers?fromImage=f41c
f401
--> POST /containers/f401/copy?resource=/go/bin
</go/bin tar stream> # saved somewhere else so we can reuse it later
d405
# no commit because we don't want caching here

FROM scratch
--> POST /images/create?fromImage=scratch
scratch
--> POST /build/47c0/add?fromImage=scratch&path=/go/bin
</go/bin tar stream> # previously extracted /go/bin
d789

ADD dnsdock /
--> POST /build/47c0/add?fromImage=d789&path=/
<tar stream>
e234 # container id
--> POST /build/47c0/commit/e234
qwer # layer id

ENV CGO_ENABLED 0
--> POST /images/create?fromImage=qwer&change="ENV CGO_ENABLED 0"
wert # layer id
```
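A tiny translator in this spirit could map instruction pairs onto (method, path) tuples. The endpoints below (`/build/<session>/add`, the `change=` form of `/images/create`) mirror the hypothetical calls in the trace above and are not real Docker API routes; this is a sketch of the shape of such a client, nothing more.

```python
def to_api_calls(session, instructions):
    """Map (INSTRUCTION, args) pairs onto hypothetical (method, path) API calls
    for one build session. Only a few instructions are handled in this sketch."""
    calls = []
    for name, args in instructions:
        if name == "FROM":
            calls.append(("POST", "/images/create?fromImage=" + args))
        elif name in ("ADD", "COPY"):
            _, _, dest = args.rpartition(" ")  # last token is the destination path
            calls.append(("POST", "/build/%s/add?path=%s" % (session, dest)))
        elif name == "ENV":
            calls.append(("POST", '/images/create?change="ENV %s"' % args))
        else:
            raise ValueError("instruction not handled in this sketch: " + name)
    return calls
```

The point is that once such a mapping exists, any client that can make these calls is a builder, regardless of what its input format looks like.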
By separating the core API from the builder, we are also able to separate the semantics of building/processing the build context + Dockerfile. That means a user can write their own build script that does not depend on a Dockerfile, but instead programmatically processes the build context and calls the build API directly. For example, we would be able to write an Ansible builder that builds Ansible playbooks into a container by writing a custom wrapper around Ansible connections.
This also means that as long as a client has access to Docker's API endpoint, it can behave as a builder, which opens up several possibilities for decoupling build file semantics from the core. For example, we could have a builder for Dockerfile v1 that would be run as:
```
# this starts a builderv1 that parses Dockerfile v1 inside /buildcontext
# and translates it into API calls
docker run \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp/buildcontext:/buildcontext \
    docker/builderv1 /bin/builder
```
while Dockerfile v2 would be run as:
```
# this starts a builderv2 that parses Dockerfile v2 inside /buildcontext
# and translates it into API calls
docker run \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp/buildcontext:/buildcontext \
    docker/builderv2 /bin/builder
```
```
docker build -t image .
```
would then be translated into the corresponding `docker run` after processing the build context into /tmp/buildcontext.
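The dispatch step could be as simple as picking a builder image by Dockerfile version and assembling the equivalent `docker run` argv. A sketch, assuming the version is already known and the `docker/builderv1`/`docker/builderv2` image names from the examples above:

```python
def builder_run_command(dockerfile_version, context_dir):
    """Build the argv for the `docker run` that dispatches to a versioned builder image.
    The image names and the /buildcontext mount point follow the examples in the text."""
    images = {"1": "docker/builderv1", "2": "docker/builderv2"}
    return [
        "docker", "run",
        "-v", "/var/run/docker.sock:/var/run/docker.sock",
        "-v", context_dir + ":/buildcontext",
        images[dockerfile_version], "/bin/builder",
    ]
```

The core no longer needs to know anything about the Dockerfile format; it just mounts the context and the API socket into whichever builder is requested.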
In the same spirit, a custom builder would be (taken from https://gist.github.com/tonistiigi/c7b539c2a1a0568020c6):
```
# this starts a custom builder that translates its own build script into API calls
docker run \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp/buildcontext:/buildcontext \
    nitrousio/builder /bin/builder
```
```
> cat /bin/builder
#!/usr/bin/env node
var docker = require('docker-builder');

var container = docker.New();
container.copy('/context/foo', '/bar');
container.commit(); // also sets itself to next
container.env('FOO', 'bar');
container.workdir('/foo/bar');
container.commit();
docker.tag(container);
```
NOTE: However, the core API also depends on some of the Dockerfile instructions for changing image configuration (see `commit --change` or `import --change`), so we might also have to maintain the (latest?) parser inside the core.
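The `--change` subset is much smaller than the full Dockerfile language, which is why keeping only that parser in the core is plausible. A simplified sketch; the whitelist below mirrors the instructions `docker commit --change` accepts today, but the parsing itself is deliberately naive:

```python
# Instructions accepted by commit --change / import --change (configuration
# changes only; no RUN, ADD, COPY, which would require executing a build step).
ALLOWED_CHANGES = {"CMD", "ENTRYPOINT", "ENV", "EXPOSE", "LABEL",
                   "ONBUILD", "USER", "VOLUME", "WORKDIR"}

def parse_change(change):
    """Split a --change string like 'ENV CGO_ENABLED 0' into (instruction, value)."""
    name, _, value = change.strip().partition(" ")
    name = name.upper()
    if name not in ALLOWED_CHANGES:
        raise ValueError("instruction not allowed in --change: " + name)
    return name, value
```

Because this subset only mutates image configuration and never creates filesystem layers, it can live in the core without dragging the whole builder back in.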
To keep the maintenance workload under control, we should avoid duplicating the same functionality in many places and instead try to inherit it from where it's already implemented.
I very much agree with this. As a matter of fact, I don't think the example API calls I wrote above are good at all. At best they just serve as examples of how the client should interact with the daemon.
I think that caching should be controlled by the daemon. The caching behavior still has to be defined, though. I'm leaning more towards some sort of transactional behavior where the client has to open/commit a transaction, every request in between counts towards the cache, and a single layer is generated per transaction.
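A rough sketch of that transactional idea, all names invented for illustration: the cache key is derived from the parent layer plus every request recorded inside the transaction, and commit either reuses the cached layer or builds exactly one new layer.

```python
import hashlib

class BuildCache:
    """Daemon-side cache sketch: one transaction -> one cache key -> one layer."""

    def __init__(self):
        self._layers = {}    # cache key -> layer id
        self._pending = None

    def begin(self, parent_layer):
        # the parent layer seeds the key, so the same requests on a
        # different base never collide
        self._pending = hashlib.sha256(parent_layer.encode())

    def record(self, request):
        # every request between begin() and commit() counts towards the key
        self._pending.update(request.encode())

    def commit(self, make_layer):
        key = self._pending.hexdigest()
        self._pending = None
        if key not in self._layers:
            # cache miss: materialize a single layer for the whole transaction
            self._layers[key] = make_layer()
        return self._layers[key]
```

On a repeated identical transaction, `make_layer` is never called again; the daemon just hands back the cached layer id, which matches the "single layer per transaction" behavior described above.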
Yes, I guess that would serve the purpose, right? The idea would be to extract part of a container and put it into a container based on another image.