Right now, the builder is tightly coupled with Docker core. The builder has the following roles:
- process the build context
- parse the Dockerfile into s-expressions
- evaluate individual s-expressions, which may involve the following jobs:
  - persist configuration, e.g. ENV, EXPOSE, VOLUME, etc.
  - import external data into a new layer, e.g. ADD, COPY, etc.
  - run extra commands that create a new layer, e.g. RUN, etc.

This design has problems:
- the current builder is not extensible. The only interface to the builder right now is the Dockerfile, which is in many cases clunky, hard to extend, and hard to deprecate.
- it is hard to guarantee backward compatibility between different versions of the Dockerfile format and/or to deprecate old Dockerfile instructions. Ideally, a Dockerfile should always build even with a new version of Docker and/or the builder.
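To make the parsing role above concrete, here is a rough Python sketch of turning Dockerfile text into the s-expression-like (INSTRUCTION, arguments) pairs the builder evaluates one at a time. This is illustrative only, not the real builder's parser; it ignores JSON-form arguments, comments inside continuations, and other Dockerfile subtleties.

```python
def parse_dockerfile(text):
    """Parse Dockerfile text into a list of (INSTRUCTION, arguments) pairs."""
    instructions = []
    continued = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith("\\"):
            # accumulate line continuations into one logical instruction
            continued += line[:-1].rstrip() + " "
            continue
        line = continued + line
        continued = ""
        name, _, args = line.partition(" ")
        instructions.append((name.upper(), args.strip()))
    return instructions
```

Each resulting pair is then evaluated independently, which is exactly the seam where a cleaner builder/core split could be introduced.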
The semantics of processing the build context as well as parsing the Dockerfile should be separated from Docker core. Docker core should only be concerned with processing the build layers (ideally via a set of defined APIs). For example, a set of Dockerfile instructions could be translated into API calls as follows (all the calls are just examples and may not be correct at all):
```
--> POST /build
47c0 # build session id

FROM golang:1.3.1
--> POST /images/create?fromImage=golang:1.3.1
asdf
--> POST /build/47c0/create?fromImage=asdf
31f1 # container id
--> POST /build/47c0/commit/31f1
31f0 # layer id

ADD . /go/src/app
--> POST /build/47c0/add?fromImage=31f0&path=/go/src/app
<tar stream>
f50c # container id
--> POST /build/47c0/commit/f50c
f41c # layer id

BUILD /go/bin # nested build
--> POST /containers?fromImage=f41c
f401
--> POST /containers/f401/copy?resource=/go/bin
</go/bin tar stream> # saved somewhere else so we can reuse it later
d405
# no commit because we don't want caching here

FROM scratch
--> POST /images/create?fromImage=scratch
scratch
--> POST /build/47c0/add?fromImage=scratch&path=/go/bin
</go/bin tar stream> # previously extracted /go/bin
d789

ADD dnsdock /
--> POST /build/47c0/add?fromImage=d789&path=/
<tar stream>
e234 # container id
--> POST /build/47c0/commit/e234
qwer # layer id

ENV CGO_ENABLED 0
--> POST /images/create?fromImage=qwer&change="ENV CGO_ENABLED 0"
wert # layer id
```
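A tiny translator in this spirit could map instruction pairs onto (method, path) tuples. The endpoints below (`/build/<session>/add`, the `change=` form of `/images/create`) mirror the hypothetical calls in the trace above and are not real Docker API routes; this is a sketch of the shape of such a client, nothing more.

```python
def to_api_calls(session, instructions):
    """Map (INSTRUCTION, args) pairs onto hypothetical (method, path) API calls
    for one build session. Only a few instructions are handled in this sketch."""
    calls = []
    for name, args in instructions:
        if name == "FROM":
            calls.append(("POST", "/images/create?fromImage=" + args))
        elif name in ("ADD", "COPY"):
            _, _, dest = args.rpartition(" ")  # last token is the destination path
            calls.append(("POST", "/build/%s/add?path=%s" % (session, dest)))
        elif name == "ENV":
            calls.append(("POST", '/images/create?change="ENV %s"' % args))
        else:
            raise ValueError("instruction not handled in this sketch: " + name)
    return calls
```

The point is that once such a mapping exists, any client that can make these calls is a builder, regardless of what its input format looks like.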
By separating the core API from the builder, we are also able to separate the semantics of building/processing the build context + Dockerfile. That means a user can write their own build script that does not depend on a Dockerfile, but instead programmatically processes the build context and calls the build API directly. For example, we would be able to write an Ansible builder that builds Ansible playbooks into a container by writing a custom wrapper around Ansible connections.
This also means that as long as a client has access to Docker's API endpoint, it can behave as a builder, which opens up several possibilities for decoupling build file semantics from the core. For example, we could have a builder for Dockerfile v1 that would be run as:
```
# this starts a builderv1 that parses Dockerfile v1 inside /buildcontext
# and translates it into API calls
docker run \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp/buildcontext:/buildcontext \
    docker/builderv1 /bin/builder
```
while Dockerfile v2 would be run as:
```
# this starts a builderv2 that parses Dockerfile v2 inside /buildcontext
# and translates it into API calls
docker run \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp/buildcontext:/buildcontext \
    docker/builderv2 /bin/builder
```
```
docker build -t image .
```
would then be translated into the corresponding `docker run` after processing the build context into /tmp/buildcontext.
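The dispatch step could be as simple as picking a builder image by Dockerfile version and assembling the equivalent `docker run` argv. A sketch, assuming the version is already known and the `docker/builderv1`/`docker/builderv2` image names from the examples above:

```python
def builder_run_command(dockerfile_version, context_dir):
    """Build the argv for the `docker run` that dispatches to a versioned builder image.
    The image names and the /buildcontext mount point follow the examples in the text."""
    images = {"1": "docker/builderv1", "2": "docker/builderv2"}
    return [
        "docker", "run",
        "-v", "/var/run/docker.sock:/var/run/docker.sock",
        "-v", context_dir + ":/buildcontext",
        images[dockerfile_version], "/bin/builder",
    ]
```

The core no longer needs to know anything about the Dockerfile format; it just mounts the context and the API socket into whichever builder is requested.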
In the same spirit, a custom builder would be (taken from https://gist.github.com/tonistiigi/c7b539c2a1a0568020c6):
```
# this starts a custom builder that translates its own build script into API calls
docker run \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp/buildcontext:/buildcontext \
    nitrousio/builder /bin/builder
```
```
> cat /bin/builder
#!/usr/bin/env node
var docker = require('docker-builder');

var container = docker.New();
container.copy('/context/foo', '/bar');
container.commit(); // also sets itself to next
container.env('FOO', 'bar');
container.workdir('/foo/bar');
container.commit();
docker.tag(container);
```
NOTE: However, the core API also depends on some of the Dockerfile instructions for changing image configuration (see `commit --change` or `import --change`), so we might also have to maintain the (latest?) parser inside the core.
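The `--change` subset is much smaller than the full Dockerfile language, which is why keeping only that parser in the core is plausible. A simplified sketch; the whitelist below mirrors the instructions `docker commit --change` accepts today, but the parsing itself is deliberately naive:

```python
# Instructions accepted by commit --change / import --change (configuration
# changes only; no RUN, ADD, COPY, which would require executing a build step).
ALLOWED_CHANGES = {"CMD", "ENTRYPOINT", "ENV", "EXPOSE", "LABEL",
                   "ONBUILD", "USER", "VOLUME", "WORKDIR"}

def parse_change(change):
    """Split a --change string like 'ENV CGO_ENABLED 0' into (instruction, value)."""
    name, _, value = change.strip().partition(" ")
    name = name.upper()
    if name not in ALLOWED_CHANGES:
        raise ValueError("instruction not allowed in --change: " + name)
    return name, value
```

Because this subset only mutates image configuration and never creates filesystem layers, it can live in the core without dragging the whole builder back in.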
To keep the maintenance workload under control, we should avoid duplicating the same functionality in many places and instead try to inherit it from where it's already implemented.
I very much agree with this. As a matter of fact, I don't think the example API calls I wrote above are good at all. At best they just serve as examples of how the client should interact with the daemon.
I think that caching should be controlled by the daemon. The caching behavior still has to be defined, though. I'm leaning more towards some sort of transactional behavior where the client has to open/commit a transaction, every request in between counts towards the cache, and a single layer is generated per transaction.
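A rough sketch of that transactional idea, all names invented for illustration: the cache key is derived from the parent layer plus every request recorded inside the transaction, and commit either reuses the cached layer or builds exactly one new layer.

```python
import hashlib

class BuildCache:
    """Daemon-side cache sketch: one transaction -> one cache key -> one layer."""

    def __init__(self):
        self._layers = {}    # cache key -> layer id
        self._pending = None

    def begin(self, parent_layer):
        # the parent layer seeds the key, so the same requests on a
        # different base never collide
        self._pending = hashlib.sha256(parent_layer.encode())

    def record(self, request):
        # every request between begin() and commit() counts towards the key
        self._pending.update(request.encode())

    def commit(self, make_layer):
        key = self._pending.hexdigest()
        self._pending = None
        if key not in self._layers:
            # cache miss: materialize a single layer for the whole transaction
            self._layers[key] = make_layer()
        return self._layers[key]
```

On a repeated identical transaction, `make_layer` is never called again; the daemon just hands back the cached layer id, which matches the "single layer per transaction" behavior described above.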
Yes, I guess that would serve the purpose, right? The idea would be to extract part of a container and put it into a container based on another image.