Nix builder for Kubernetes



This document describes an idea for a Kubernetes controller capable of building container images on-demand based on Nix expressions.

Overview

The basic idea is that a custom resource could be added to the Kubernetes API server that describes container images, for example:

---
apiVersion: k8s.nixos.org/v1
kind: NixImage
metadata:
  name: curl-and-jq
data:
  tag: v1
  contents:
    - curl
    - jq
    - bash
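
For this to work, the NixImage kind would first have to be registered with the API server via a CustomResourceDefinition. A minimal sketch of that definition (the plural/singular names below are assumptions, and the v1beta1 API group reflects clusters around the 1.11 era mentioned later) might look like this:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: niximages.k8s.nixos.org
spec:
  group: k8s.nixos.org
  version: v1
  scope: Namespaced
  names:
    kind: NixImage
    plural: niximages
    singular: niximage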

This resource could be picked up by the controller and transformed into a Nix expression that calls buildLayeredImage (see this blog post by @grahamc for more information):

pkgs.dockerTools.buildLayeredImage {
  name = "curl-and-jq"; # from metadata.name
  tag = "v1"; # from data.tag
  contents = with pkgs; [ curl jq bash ]; # from data.contents
};

The controller could then build this image and serve it via the Docker registry protocol, basically pretending to be a "normal" Docker registry inside of the cluster.
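
Concretely, "pretending to be a registry" would mostly mean answering the pull-side requests of the Docker registry HTTP API (v2) with artefacts derived from the Nix build output, roughly:

GET /v2/                                  # API version check by the Docker daemon / kubelet
GET /v2/curl-and-jq/manifests/v1          # image manifest, assembled from the build result
GET /v2/curl-and-jq/blobs/<digest>        # individual layer tarballs and the image config blob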

Users could then use this image by specifying something like image: <address of controller>/curl-and-jq:v1 in a pod template specification.
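
As a sketch, assuming the controller is reachable through an in-cluster Service (the nix-registry.kube-system.svc.cluster.local address below is purely hypothetical), that could look like:

apiVersion: v1
kind: Pod
metadata:
  name: debug-shell
spec:
  containers:
    - name: shell
      # hypothetical in-cluster address of the controller's registry endpoint
      image: nix-registry.kube-system.svc.cluster.local/curl-and-jq:v1
      command: ["bash"]
      stdin: true
      tty: true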

Why?

When using Kubernetes I have, at various times, found myself wanting a one-off image like the one above (with a few simple tools installed), without having to rummage through Docker Hub or go through the cycle of building, publishing and then pulling my own image.

Additionally, software that is built with Nix out of the box could be easily transformed into a deployable image in this way.

Container images are currently like a stateful intermediate step between a description of the desired contents of an image (usually specified imperatively in a Dockerfile) and the declarative deployment specification. It would be nice (for example from a "GitOps" perspective) to be declarative all the way and (thanks to repeatable, sandboxed builds) not to have to worry about keeping track of artefacts.

How?

Custom resources are simple to define, and the controller itself should not be too complicated to implement either. I have previously written a toy project implementing a "fake" Docker registry API and successfully deployed it inside of a Kubernetes cluster, so most of the actual implementation consists of gluing things together.

More interesting is how the structure of the custom resource should be defined and how it maps to the buildLayeredImage arguments, as well as to additional parameters like NIX_PATH, overlays or fancier things like custom SSH credentials.
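
As a rough sketch of that mapping (all of the extra fields here, such as overlays or maxLayers, are hypothetical and not part of any existing schema), the generated expression might end up threading such parameters through like this:

# Sketch only; the overlays and maxLayers inputs are hypothetical resource fields.
let
  # <nixpkgs> resolution would be governed by the NIX_PATH the controller sets
  # for the build, e.g. based on a nixPath field on the NixImage resource.
  pkgs = import <nixpkgs> {
    overlays = [ ];  # overlays referenced in the resource could be injected here
  };
in pkgs.dockerTools.buildLayeredImage {
  name = "curl-and-jq";                     # from metadata.name
  tag = "v1";                               # from data.tag
  contents = with pkgs; [ curl jq bash ];   # from data.contents
  maxLayers = 100;                          # from a hypothetical data.maxLayers field
}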

Most of the second part will probably come from a bit of experimentation.

Open questions

  1. Feedback from failed builds

    One issue I'm not yet sure about is how to best inform the user of a build failure. Traditionally in Kubernetes, declarative specifications of "desired state" lead to the creation of new API resources that represent actual state.

    A simpler approach may be to track state in the NixImage resource itself, i.e. adding a BuildPending status or similar when a resource is first detected and then changing it to Building, Built or Failed (and whatever else may come up).

    As of Kubernetes 1.11, custom resources can define extra columns that should be printed when using kubectl. This would make it possible to create simple kubectl output like this for users (a CRD fragment sketching these columns follows after this list):

    NAME          VERSION        STATUS
    curl-and-jq   v1             Built
    rustc         1.28           Building
    
  2. Extra parameters & configuration

    As mentioned above, users may want to configure a few extra parameters when building a Nix image. Many of these could be supported by having additional configuration parameters available in the resource definition (e.g. to set a NIX_PATH).

    It's not yet clear (to me, anyway) exactly which those might be.

  3. Build location

    An interesting question is where the images should actually be built. Is the controller running nix-build inside of its pod? Is it spawning new pods that do the building? Should it "simply" leave it up to the user to configure a build machine via the normal Nix configuration? Binary caches? ...
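
Returning to the first question, the extra kubectl columns could be declared directly on the CRD. A fragment like the following (the .status.phase path assumes the controller writes its BuildPending/Building/Built/Failed state into a status field on the resource) would produce output similar to the table above:

# fragment of the hypothetical NixImage CustomResourceDefinition sketched earlier
spec:
  additionalPrinterColumns:
    - name: Version
      type: string
      JSONPath: .data.tag
    - name: Status
      type: string
      JSONPath: .status.phase   # e.g. BuildPending, Building, Built or Failed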
