Why Cue is exciting for a Kubernetes Ecosystem Developer.md

Caveat: I am writing this based on my understanding of the tools I tried at the time. Things change constantly in the k8s world, so some of the comments here may be outdated.

I don't mean to hurt anyone's feelings. I am trying to find tools that help me solve our customers' problems, and to make a decision that helps us deliver our products to our customers today.

Data vs Code: I think data, i.e., YAML, is the right layer for coordination among k8s ecosystem tools. YAML works great as the "wire format". But human users often need a higher-level YAML generation tool (aka code).

Our use-cases: We make k8s operators that we sell to users. In the beginning, we offered users a curl | bash script. Over time, users started asking for customizations in the installer that became hard to do with just scripts. Also, scripts don't work on Windows. So, we started supporting both scripts and Helm charts. Over time it became quite hard to keep these in sync and to test them. So, once Helm 3 was released, we switched to using only Helm 3.

Our users can use Helm charts directly. But these days we also see users automating Helm charts via Terraform or GitOps tools.

We also have a UI where users can deploy applications (databases in our case). Helm charts work great here. Users essentially generate a values.yaml, and the chart is applied to the cluster programmatically from a Go backend server.

Issues with kustomize/kpt/Crossplane XRD type tools:

In my experience these tools are generally underpowered or not expressive enough for what users are trying to do. Users occasionally want to write loops or if/else conditions when generating YAML.
In my view kustomize/kpt are basically re-inventing all the functionality of the Helm CLI and template functions, but with a worse UX. I don't think kustomization.yaml is any better than Helm template files where all fields are templatized.
We want to run the YAML generator tool from a multi-tenant server backend. It is much better to call a single Go library (Helm) than to make exec calls that involve various random Docker images as plugins (like kustomize/kpt). The general security implications of the latter make me very uncomfortable.
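
To make the loop and if/else point concrete, here is a minimal sketch in plain Python (not Helm, kustomize, or Cue). The `render` function, the value names (`shards`, `replicas`, `monitoring`), and the resource names are all hypothetical, and the specs are trimmed rather than complete. The output is JSON, which kubectl accepts in place of YAML.

```python
import json


def render(values):
    """Build a list of Kubernetes-style manifests from a values dict.

    Field and value names are hypothetical; the specs are trimmed
    for illustration and are not complete Kubernetes objects.
    """
    manifests = []
    # Loop: one StatefulSet per shard.
    for i in range(values["shards"]):
        manifests.append({
            "apiVersion": "apps/v1",
            "kind": "StatefulSet",
            "metadata": {"name": f"{values['name']}-shard{i}"},
            "spec": {"replicas": values["replicas"]},
        })
    # Condition: only emit the stats Service when monitoring is enabled.
    if values.get("monitoring", False):
        manifests.append({
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {"name": f"{values['name']}-stats"},
            "spec": {"ports": [{"name": "metrics", "port": 9090}]},
        })
    return manifests


if __name__ == "__main__":
    docs = render({"name": "demo", "shards": 2, "replicas": 3, "monitoring": True})
    for doc in docs:
        print(json.dumps(doc, indent=2))
```

This is the kind of logic that is a few lines in a general-purpose or configuration language but awkward to express in a patch-and-overlay tool.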

Issues with Helm 3: Helm 3 works great for us in general. But there have been some bugs/limitations in Helm that have not been solved for over 2 years, and trying to understand the plans for the future seems to lead to unpleasant Twitter conversations.

CRD installation: Helm will not apply CRDs during helm upgrade. This makes the upgrade operation for all our applications difficult and error-prone.
We now have a considerably large number of Helm charts. These are hard to test (does this combination of values generate valid YAML?) and hard to modify (we basically have to change them one by one, manually).
We have a number of cases where we need to pass data from one YAML to another. This is similar to Terraform modules or Crossplane XRDs.
I compiled a wish list for Helm some time ago: https://docs.google.com/document/d/1YTpcmgeA7AxwQsZGNRANBGEP9Tx2IM6ScBoIV_VpC1Q/edit# . Some of these items are still valid and remain pain points for us today.
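
The "does this combination of values generate valid YAML?" question can be approached as a brute-force sweep over the values. The sketch below is plain Python with a stand-in `render` function and a hand-written `validate` check. Both are hypothetical; a real setup would call the actual chart renderer and validate against the real Kubernetes schemas.

```python
import itertools


def render(values):
    """Stand-in for a chart renderer (hypothetical, trimmed output)."""
    doc = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": values["name"]},
        "spec": {"replicas": values["replicas"]},
    }
    if values["tls"]:
        # Hypothetical annotation, used only to show a conditional branch.
        doc["metadata"]["annotations"] = {"example.com/tls": "enabled"}
    return doc


def validate(doc):
    """Return a list of problems; an empty list means the document passed."""
    problems = []
    for key in ("apiVersion", "kind", "metadata", "spec"):
        if key not in doc:
            problems.append(f"missing {key}")
    replicas = doc.get("spec", {}).get("replicas")
    if not isinstance(replicas, int) or replicas < 1:
        problems.append(f"invalid replicas: {replicas!r}")
    return problems


def sweep(choices):
    """Render every combination of the given values and collect failures."""
    keys = sorted(choices)
    failures = []
    for combo in itertools.product(*(choices[k] for k in keys)):
        values = dict(zip(keys, combo))
        problems = validate(render(values))
        if problems:
            failures.append((values, problems))
    return failures


if __name__ == "__main__":
    failures = sweep({
        "name": ["demo"],
        "replicas": [0, 1, 3],
        "tls": [True, False],
    })
    for values, problems in failures:
        print(values, problems)
```

With Go templates the validation step has to happen after rendering, against text. The appeal of a typed configuration language is that the schema check can live next to the template itself.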

Why Cue: I have tried to avoid re-inventing Helm. For a small company like ours, it is very hard to justify the cost of such an undertaking. But given that Helm is basically stuck in 2019 and some critical problems, like the CRD issue, have not been resolved in all this time, I have started to explore alternatives. So, I started looking into writing a Helm-like tool that replaces Go templates with Cue and fixes the issues I mentioned earlier.

Nice things about Cue:

Cue is a strict superset of JSON. Any JSON file is already a valid Cue file.
Cue generates schema from Go types directly. This means we can generate schema for template files from official k8s types or CRD types, and test them without any need for a k8s cluster. This is huge. I can imagine a potential cue-fuzz tool that tests combinations of values.yaml and checks that the generated YAML is still valid.
Cue's order independence and module / packaging system means we can share "templates" across charts, or our users can potentially use this from their cue modules.
Cue is written in Go. So, I can access it programmatically and, more importantly, it is a single Go library that can be invoked from a server-side process.
Cue has built-in support for workflows and can auto-discover their order. I am able to use this to implement a Terraform-module-style tool directly on top of k8s YAML.
Cue can work with any YAML/JSON, not just k8s YAML. This is handy when you need to embed JSON/YAML inside ConfigMaps or Secrets, like when writing Prometheus alert rules. Example.
Cue's Slack community seems to be more helpful so far.
Cue is seeing adoption among other users in the cloud native ecosystem, like Grafana using Cue for their dashboard generation and Solomon's new startup dagger.io using Cue for their product. This is exciting to see.
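
The workflow point above (discovering run order from the references between documents) can be illustrated without Cue. The sketch below is plain Python, not Cue's own workflow API: each hypothetical step lists the other steps whose output it reads, and `graphlib.TopologicalSorter` from the standard library (Python 3.9+) derives a valid order. The step names are made up.

```python
from graphlib import TopologicalSorter

# Each step maps to the steps whose output it reads (hypothetical names).
steps = {
    "database": [],
    "credentials-secret": ["database"],
    "app-configmap": ["credentials-secret"],
    "app-deployment": ["app-configmap", "credentials-secret"],
}


def run_order(deps):
    """Return the steps in an order where every dependency comes first.

    Raises graphlib.CycleError if the references form a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for step in run_order(steps):
        print("apply", step)
```

The point is that nobody writes the order down. It falls out of which document reads from which, the same way Terraform orders resources within a module.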

Where cue could improve:

It is one more thing to learn. Cue's docs have an academic feel to them. But after spending a week or so studying Cue, I think its syntax is pretty intuitive. It is missing a "user guide" type doc that is more focused on DevOps engineers. But that is easy to fix.
It seems fairly early in Cue land. But the core team is focused on it full time now. So, I am hopeful that these things will be resolved in time.

tamalsaha/Why Cue is exciting for a Kubernetes Ecosystem Developer.md