There are two ways of trying this out:
As long as you have access to a cluster, you can get Concourse to run workloads there; you just need to get Concourse itself running first. You can do so using the Makefile included here:
# run postgres in docker
#
make db
# start a kubernetes cluster using `kind` with a modified CRI config (because of
# containerd) that whitelists `http`-based registries.
#
make cluster
# build, install, and run Concourse (make sure you run `yarn build`
# first to build the UI assets)
#
make run
That done, you can now access Concourse on http://localhost:8080.
I've uploaded an image (`cirocosta/concourse:k8s-iter-1`) to Docker Hub for those wanting to try out the "in-cluster" experience, and repurposed "the Helm chart of today" to support the "runtime of tomorrow".
# clone the branch of the chart that includes the necessary RBAC configs for
# setting up a sufficiently powerful service account
#
git clone \
--branch k8s-iter-1 \
https://github.com/concourse/concourse-chart
# clone the branch of this PR
#
git clone \
--branch k8s-iter-1 \
https://github.com/concourse/concourse
# get the `init` configmap into the cluster so that we can keep containers alive
# until processes are meant to be executed in them.
#
NAMESPACE=default make -C ./concourse init
# install the chart / render the templates with the sample values under
# `hack/k8s/values.yaml` from this PR's branch
#
helm dependency update ./concourse-chart
helm upgrade \
--install \
--values=./concourse/hack/k8s/values.yaml \
test \
./concourse-chart
With the chart installed, you can now port-forward the ATC (http://localhost:8080) and get going.
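For instance, the port-forward could look something like the following; the service name is an assumption based on the `test` release name above and the chart's usual naming, so adjust it to whatever your installation actually exposes:

```sh
# forward the ATC's HTTP port to localhost:8080
# (the service name `test-web` is assumed, not guaranteed by this PR)
kubectl port-forward service/test-web 8080:8080
```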
ps.: your cluster must support pulling images from http-based registries served as pods. You can get a local cluster configured that way by running `make -C concourse cluster` if you'd like to use `kind`.
The execution of each step follows the same pattern we're accustomed to with Garden, where we set the sandbox up and then run the desired process inside it (`db.Creating()` -> `container.Create()` -> `db.Created()` -> ... -> `container.Run()`).
Because a build plan can be seen as a directed acyclic graph when it comes to dependencies, we can rely on that fact to dictate how each step either gathers inputs or supplies outputs.
For each step, as long as its dependencies (including transitive ones) were able to fulfill what they should (e.g., a `get` successfully running `/opt/resource/in`), it'll be able to retrieve any artifact that it might need.
When running a step, `atc` communicates with Kubernetes to create pods that represent those steps.
Each step pod can be seen as a potential permutation of two configurations:
- i. having an "inputs fetcher" init container, responsible for fetching artifacts retrieved by dependencies
- ii. having an "output streamer" sidecar container, responsible for providing artifacts to those who might depend on this step
For instance, `task` in the example above would look like:
- having an "input fetcher" capable of retrieving data from `repository` and `image`
- having an "output streamer" capable of streaming the artifacts it produces to the "bucket" step ("bucket" would pull from the output streamer endpoint)
so that in the end, we have this form of "peer-to-peer" communication between the pods themselves:
(arrow direction indicating "depends on")
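To make that shape concrete, here's a rough sketch of what such a step pod could look like; the names, images, mount paths, and label are all illustrative assumptions, not the exact spec this PR generates:

```sh
# purely illustrative step pod: an init container standing in for the "inputs
# fetcher", a long-lived main container that processes get exec'd into, and a
# sidecar standing in for the "output streamer". all names are hypothetical.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: task-step-example
  labels:
    app: concourse-step            # hypothetical label
spec:
  initContainers:
    - name: inputs-fetcher         # would pull `repository` and `image` from the
      image: busybox               # dependencies' output streamers (stand-in image)
      command: ["sh", "-c", "echo fetching inputs"]
      volumeMounts:
        - { name: artifacts, mountPath: /inputs }
  containers:
    - name: main                   # held alive; the step's process gets exec'd here
      image: busybox
      command: ["sleep", "86400"]
      volumeMounts:
        - { name: artifacts, mountPath: /tmp/build }
    - name: output-streamer        # would serve this step's outputs to dependents
      image: busybox               # (stand-in image)
      command: ["sleep", "86400"]
      volumeMounts:
        - { name: artifacts, mountPath: /outputs }
  volumes:
    - name: artifacts
      emptyDir: {}
EOF
```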
Once done with the build, the regular internal Concourse container lifecycle would take care of moving the containers in our DB from the CREATED state to DESTROYING; the Kubernetes implementation of a worker would then notice that certain pods are no longer desired and proceed with deleting them.
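In effect, the garbage-collection side boils down to something like the following (the pod name and namespace here are hypothetical):

```sh
# what the kubernetes worker backend would effectively do for a container
# that has transitioned to DESTROYING
kubectl delete pod task-step-example --namespace default
```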
The approach presented here is focused on changing as little as possible of the constructs we have in our codebase today. While it demonstrates that it is possible to run Concourse workloads on Kubernetes, the current design might raise some eyebrows.
Given that the container runtimes that kubelets communicate with need to trust the registries they interact with, we have to rely on whitelisting the internal pod domain as a set of trusted insecure registries. Interestingly, this is already done by default on GKE.
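For a local `kind` cluster, that whitelisting can be expressed as a containerd config patch; here's a minimal sketch of the idea (the registry domain is a made-up example, and the actual `make cluster` target may differ in its details):

```sh
# sketch: create a `kind` cluster whose containerd treats an in-cluster,
# http-only registry domain as an insecure mirror (domain is hypothetical)
cat <<'EOF' > kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
  - |-
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.default.svc.cluster.local"]
      endpoint = ["http://registry.default.svc.cluster.local"]
EOF
kind create cluster --config kind-config.yaml
```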
To avoid the need for that whitelisting, we can explore the avenue of having the images pushed to a central registry in the cluster that's already trusted by the kubelets. As long as we make the process of getting "inputs in" and "outputs out" pluggable enough, we could have either.
It's very convenient to be able to just `exec` (or attach to) a process via the `apiserver`'s `exec` endpoints, making the current Concourse container lifecycle and process execution work with pretty much no changes needed in our code flow.
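For illustration, what the worker does through the apiserver is roughly equivalent to this `kubectl` invocation (pod and container names are hypothetical, and `/bin/sh` stands in for the step's actual process):

```sh
# run a process inside the already-running main container of a step pod,
# going through the apiserver's `exec` subresource
kubectl exec --stdin --tty --container main task-step-example -- /bin/sh
```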
It might be that this does not scale, though, with the `apiserver` being hit so hard by a system like Concourse. There can also be concerns about keeping that connection open for a long time (e.g., steps that take a long time to finish their main executions), or about the sheer throughput necessary (steps that log tons to `stderr`).
If we don't want to modify the imperative nature of the ATC making the requests to execute, we could have some form of "shim" mounted into every main container in step pods; the ATC would reach out to it to `exec` processes there and to deal with the log streams / re-attaching.
Having a pod per Concourse container necessarily means that we'd have a pod for each resource scope id (assuming no use of ephemeral check containers). That means that an installation with 5000 of those would be fetching 5000 pods on every worker tick. This could be improved by leveraging the same mechanisms that controllers do - perhaps even making this a controller itself that gets informed of changes to pods matching the label that we have?
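As a rough illustration of that direction, the idea is to subscribe to changes rather than re-list everything on each tick (the label selector below is a made-up example):

```sh
# watch only the pods we care about instead of listing all of them every tick --
# roughly what an informer-backed controller would do for us under the hood
kubectl get pods \
  --namespace default \
  --selector 'app=concourse-step' \
  --watch
```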