The new generic distributed content manifest format (schema version 2) [see distribution/distribution#62] is approaching final approval, and while some discussion has happened in-line, it seems reasonable to break out the various pieces that would be required to implement a multi-arch Docker image solution on top of the flexible manifest format being proposed.
As a starting point it is useful to discuss what the intended use case is for multi-architecture images in the Docker platform. The following requirements summarize the expected capabilities of the engine + registry + storage/retrieval format that is implemented.
- A specific repository `name:tag` manifest will need to contain the proper information for access/retrieval of multiple architectures
  - For example `docker pull ubuntu:15.04` should have the capability on a POWER system to pull the POWER-specific image layers which comprise an Ubuntu 15.04 image
- Not every `name:tag` manifest will have information for more than one architecture (e.g. the default amd64/linux images of today)
- `docker push` of a `name:tag` manifest should create a manifest which is {tagged|labeled|marked} appropriately for its arch/os settings (already encoded in Docker's layer/image metadata)
  - When other architecture manifests are pushed of the same `name:tag` a "merge-like" capability should be provided to create a "super-manifest" which references each of the architecture-specific manifests which are available for that `name:tag`
  - The push activity should not need to be coordinated across architectures: I may create an amd64/linux manifest today, and a ppc64le/linux manifest next week. A `GET` for that repository manifest path from my registry in week 1 should respond with one manifest supporting architecture/os pair amd64/linux, and in week 2 the same `GET` should respond with a super-manifest referencing the pushed amd64/linux and ppc64le/linux content manifests.
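To make the uncoordinated-push requirement concrete, here is a minimal sketch of the "merge-like" behavior. It assumes a purely in-memory stand-in for the registry; `push_manifest`, `get_manifest`, the `Store` type, and the placeholder digests are hypothetical names for illustration and are not part of the Distribution API.

```python
from typing import Dict, Union

# Hypothetical in-memory stand-in for a registry: maps "name:tag" to the
# per-platform manifest digests pushed so far. Not the Distribution API.
Store = Dict[str, Dict[str, str]]


def push_manifest(store: Store, name_tag: str, platform: str, digest: str) -> None:
    """Record a platform-specific manifest under name:tag.

    A later push for another platform is merged in; it does not replace
    the manifests already recorded for other platforms.
    """
    store.setdefault(name_tag, {})[platform] = digest


def get_manifest(store: Store, name_tag: str) -> Union[str, Dict[str, str]]:
    """Return what a GET for name:tag would resolve to in this sketch.

    One platform pushed: that platform's manifest digest.
    Several platforms pushed: a super-manifest mapping platform -> digest.
    """
    platforms = store[name_tag]
    if len(platforms) == 1:
        return next(iter(platforms.values()))
    return dict(platforms)


store: Store = {}
# Week 1: only the amd64 manifest exists.
push_manifest(store, "ubuntu:15.04", "linux/amd64", "digest-amd64")
week1 = get_manifest(store, "ubuntu:15.04")
# Week 2: a ppc64le manifest is pushed independently.
push_manifest(store, "ubuntu:15.04", "linux/ppc64le", "digest-ppc64le")
week2 = get_manifest(store, "ubuntu:15.04")
```

In this sketch `week1` is the single amd64 digest and `week2` is a mapping covering both platforms, which mirrors the week 1 / week 2 `GET` behavior described above. Whether the merge happens in the registry or in the pushing client is left open here.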
Josh Hawn has provided a potential idea for this super-manifest/"fat manifest" using the capabilities already provided for in the generic manifest schema v2 referenced above. See his comment here: distribution/distribution#62 (comment)
The idea specifically is that, given that each entry in a manifest's list of `dependencies` has a `mediaType`, and in his example manifest he references a v1 image spec/config as `mediaType: application/vnd.docker.container.image.params.v1+json` and the layers associated with it as `mediaType: application/vnd.docker.container.image.layer+x-gtar`, there could be a `mediaType` called `application/vnd.docker.container.image.combined+json` which has content like:
{
"linux/amd64": "docker.com/library/ubuntu@fa7139b345c6f88c8329e68d15864baf7d2b907ed435e3996ba983ab0ebcb7d1",
"linux/arm": "docker.com/library/ubuntu@e457c078b68d843c8e050b454936a600a0c5d133c094e42bb4fcda68f072e976",
"linux/ppc64le": "docker.com/library/ubuntu@7ccf96608588961b1e8e4cc3d7ec02906efb144b5b092f89c4292c74d5d1cde8"
}
Given the flexibility of the manifest format, these then become a layer of indirection for a client to algorithmically choose the os/arch combination required (or respond to an end-user that the requested content does not exist for their os/arch) and follow the pointer to get the manifest with dependencies containing the actual image params and layer data for that architecture.
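As a rough sketch of that client-side indirection, the lookup could be as simple as the following. It assumes the combined content is exactly the JSON object shown above, keyed by `os/arch`; the function name `resolve_platform` is hypothetical and not part of any Docker client.

```python
import json
from typing import Optional

# The combined content from the example above, keyed by "os/arch".
COMBINED = """
{
  "linux/amd64": "docker.com/library/ubuntu@fa7139b345c6f88c8329e68d15864baf7d2b907ed435e3996ba983ab0ebcb7d1",
  "linux/arm": "docker.com/library/ubuntu@e457c078b68d843c8e050b454936a600a0c5d133c094e42bb4fcda68f072e976",
  "linux/ppc64le": "docker.com/library/ubuntu@7ccf96608588961b1e8e4cc3d7ec02906efb144b5b092f89c4292c74d5d1cde8"
}
"""


def resolve_platform(combined_json: str, os_name: str, arch: str) -> Optional[str]:
    """Return the manifest reference for os_name/arch, or None if absent.

    A None result is the case where the client should tell the end user
    that the requested content does not exist for their os/arch.
    """
    entries = json.loads(combined_json)
    return entries.get(f"{os_name}/{arch}")


# A POWER host follows the pointer to the ppc64le manifest.
power_ref = resolve_platform(COMBINED, "linux", "ppc64le")
# A platform with no entry yields None instead of a reference.
missing_ref = resolve_platform(COMBINED, "windows", "amd64")
```

The returned reference would then be fetched as an ordinary manifest whose dependencies carry the image params and layer data for that architecture.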
- Distribution PR#62 needs to be finalized and agreed-to by all parties - underway as we speak.
- Need agreement across Distribution and Engine projects on the combined `mediaType` for fat/super-manifest content.
- Docker daemon (or `dist` command?) algorithm for `pull` needs to properly handle combined `mediaType` redirection based on os/arch of daemon
- Docker daemon (or `dist` command?) `push` implementation needs capability to create super-manifest when second os/architecture of the same `name:tag` image is pushed to a v2 schema-supporting registry.
- Any coordination required with swarm, machine, compose regarding the super-manifest?
Future: Orchestration clients may need a way to query/specify/hint os/arch requests. Use case: `docker machine` is used to create Docker engines across a multitude of systems, including those of different architectures. When workloads are deployed using `swarm`, and there are multiple architectures supported by the underlying machines and supported by the images used in the workloads being deployed, who will decide which architecture to use? Should there be a hinting/policy mechanism allowed to prefer POWER, for example, for certain images or workloads, and prefer amd64 for others?
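One possible shape for such a hinting mechanism, purely as a sketch: intersect the platforms the image offers with the platforms the machines provide, then walk an ordered preference list. The function `choose_platform` and the idea of a per-workload preference list are assumptions for illustration; nothing like this exists in swarm today as far as this proposal is concerned.

```python
from typing import Iterable, List, Optional


def choose_platform(
    image_platforms: Iterable[str],
    machine_platforms: Iterable[str],
    preferences: List[str],
) -> Optional[str]:
    """Pick an os/arch that both the image and the machines support.

    preferences is an ordered hint list (most preferred first). If no
    hint matches, fall back to the alphabetically first common platform
    so the choice is deterministic. Returns None when nothing overlaps.
    """
    candidates = set(image_platforms) & set(machine_platforms)
    for platform in preferences:
        if platform in candidates:
            return platform
    return min(candidates) if candidates else None


image = ["linux/amd64", "linux/ppc64le"]
# Mixed cluster with a POWER preference for this workload.
preferred = choose_platform(image, ["linux/amd64", "linux/ppc64le"], ["linux/ppc64le"])
# Same preference, but only amd64 machines are available.
fallback = choose_platform(image, ["linux/amd64"], ["linux/ppc64le"])
```

Here `preferred` resolves to `linux/ppc64le` and `fallback` to `linux/amd64`. Who owns the preference list (the image author, the operator, or the scheduler) is exactly the open question above.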
This doesn't talk about APIs. Is that because it's still TBD, or because the registry API already supports the daemon specifying the arch/os on pushes and pulls?