Why Helm never felt like it belonged
Kubernetes is one of the largest and fastest growing open-source projects. Since its inception in 2014, Kubernetes has received tens of thousands of contributions from the community and has been enhanced by a plethora of new tools. But that doesn’t come without its downsides. Namely, every now and then, there’s a tool that doesn’t adhere to the core Kubernetes principles, and I think one such tool is Helm.
I have tried for so long to understand why the Kubernetes community chooses Helm over the other tools, especially given my experience with a tool that has some similar features (as you’ll see later). To finally get some insight, I decided to write this blog post to learn more about Helm and its community.
If you search on Google for Kubernetes principles, you are unlikely to find this document. While the principles outlined are very succinct, most Kubernetes developers could probably talk about each of these ideas at length. One such talk, although probably not exhaustive, comes from Saad Ali at KubeCon. If you’re not deeply familiar with Kubernetes, I highly suggest giving it a listen before you read the rest of the text. Now, without further delay, here are the principles mentioned in the talk, and why I think Helm breaks some of them.
- Kubernetes APIs are declarative rather than imperative.
- The Kubernetes control plane is transparent. There are no hidden internal APIs.
- Meet the user where they are.
- Workload portability.
Declarative over imperative
One of the very first examples mentioned in the talk is about the downsides of client-side scripting. Many of the same problems presented for scripting against the cluster are introduced by Helm as well. What if your machine dies? What if you lose connection? What if your colleague runs the same command at the same time? But also, what if you run the wrong version of Helm? The answer is - you’re possibly going to end up in an inconsistent state between two versions. The underlying cause is that Helm’s workflow is imperative, not declarative. As opposed to kubectl + Kustomize, for example, whose joint job is to declare the new state of the Kubernetes cluster, Helm can be used to manage the lifecycle of the objects and apply them in a certain order - most notably with hooks.
When I mentioned lifecycle, I immediately thought of only one thing - the Kubernetes control plane. As you’ll see in the related talk, it’s quite clear that lifecycle managers should be controllers, and that they should run inside the cluster itself. The strong push for server-side is most evident in the recently introduced Server Side Apply (SSA), which tries to fix some of the issues with kubectl by removing one of the last pieces of code that run client-side.
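To make the difference concrete, here is a minimal Python sketch of the two approaches. It is a toy model, not how Helm or any Kubernetes controller is actually implemented: the cluster is a plain dict, and `reconcile`, `imperative_upgrade`, and the object names are all made up for illustration.

```python
from typing import Callable, Dict, List

State = Dict[str, dict]  # hypothetical model: object name -> spec


def reconcile(desired: State, actual: State) -> State:
    """One controller-style pass: whatever `actual` was, the result matches `desired`."""
    # Drop objects that are no longer declared, then create/update the rest.
    result = {name: spec for name, spec in actual.items() if name in desired}
    result.update(desired)
    return result


def imperative_upgrade(actual: State, steps: List[Callable[[State], object]],
                       fail_after: int) -> State:
    """Run client-side steps in order, stopping when the client 'dies'."""
    state = dict(actual)
    for i, step in enumerate(steps):
        if i == fail_after:
            break  # machine died, connection lost, ...
        step(state)
    return state


desired = {"web": {"image": "web:2"}, "worker": {"image": "worker:2"}}
actual = {"web": {"image": "web:1"}, "worker": {"image": "worker:1"}, "old-job": {}}

steps = [
    lambda s: s.update(web={"image": "web:2"}),
    lambda s: s.update(worker={"image": "worker:2"}),
    lambda s: s.pop("old-job", None),
]

# The client dies after the first step: the cluster is stuck between versions.
half_done = imperative_upgrade(actual, steps, fail_after=1)

# A reconcile pass converges from any starting point and is safe to repeat.
converged = reconcile(desired, half_done)
```

The interrupted script leaves `web` on the new version and `worker` on the old one, while the reconcile pass does not care how the cluster got into that state. That is roughly why I’d rather have this loop run inside the cluster than on my laptop.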
Transparent control plane, no hidden internal APIs
One of my favourite features of Kubernetes is its transparency. Everything goes through a single endpoint, and everything is available to all parties (with the right access permissions). This means that I can start using a new piece of software for Kubernetes and inspect what it’s doing to the cluster. That’s why, when I think about working with Kubernetes, I mostly think about using kubectl.
With Helm, this is rather different. It is actively trying to hide what it’s doing by providing abstractions between you and the actual state of the cluster. This comes in two forms - the CLI and Charts. The former creates new concepts for working with the cluster that are not consistent with the declarative intent of the Kubernetes project. Instead of declaring the state of the cluster, it introduces additional commands to describe how to get to the desired state of the cluster, further violating the first principle. More importantly, both hide the information about which Kubernetes objects are being changed on the cluster. Charts especially encourage users to ignore the underlying Kubernetes objects by providing an additional parameterization abstraction over them. This inevitably ends up being restrictive for end users who want to change fields that are not parameterized by the Chart and thus cannot be changed without creating their own copy of it.
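For contrast, here is a rough sketch of the overlay idea that patch-based tools build on: you keep the full object and layer a partial object on top, so any field can be changed, not only the ones somebody decided to parameterize. This is a simplified merge in the spirit of JSON Merge Patch, written for illustration only. `merge_patch` and the sample Deployment are hypothetical, and Kustomize’s real patching is richer (this sketch, for instance, replaces lists wholesale).

```python
import copy


def merge_patch(base: dict, patch: dict) -> dict:
    """Recursively overlay `patch` on `base`; a None value deletes the key."""
    result = copy.deepcopy(base)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        elif isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_patch(result[key], value)
        else:
            # Scalars and lists are replaced as a whole (a simplification).
            result[key] = copy.deepcopy(value)
    return result


# A trimmed-down, illustrative Deployment as published "upstream".
base = {
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {"replicas": 1, "template": {"spec": {"dnsPolicy": "ClusterFirst"}}},
}

# The overlay touches a field nobody had to expose as a parameter first.
overlay = {"spec": {"template": {"spec": {"terminationGracePeriodSeconds": 60}}}}

patched = merge_patch(base, overlay)
```

The upstream object stays untouched and fully visible, and the overlay reads like the Kubernetes object it modifies.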
Meet the users where they are
One of the early problems of Kubernetes was that its API was pretty verbose and static. Since Kubernetes didn’t meet users with any solution at that time, it prompted many of them (including me) to try something other than pure kubectl & YAML. Roughly four years ago, when I was first faced with this problem, I was just at the start of my journey with Ansible. Not having much experience with Kubernetes, seeing how flexible Ansible is, and given that it already had support for interfacing with Kubernetes, I decided to give Ansible and its Jinja templating a try. Things didn’t work out so great.
A year later, I learned about Helm, which aimed to solve this same problem (and more). The trouble is, to me it looked exactly like Jinja. It used a similar structure-unaware templating engine to generate structured data. Most notably, the engine is partially unaware of the indentation needed to create YAML structures. So, in order to get a properly indented document, you need to resort to ugly hacks like explicitly specifying the number of indents or prefixing blocks with whitespace truncation. And this really felt like trying to put Legos together with glue, all the while being blindfolded. Since then, Helm was supposed to get Lua support, which would address the drawbacks of using a general purpose templating engine, but it hasn’t happened yet. Even if it did at some point, bringing in Lua as an addition only, without deprecating the current templating engine, wouldn’t solve the original problem.
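Here is a small Python illustration of the indentation problem. It is only an analogy: Python’s `string.Template` stands in for any text-based templating engine (Helm uses Go templates, which this does not reproduce), and the snippet and variable names are made up.

```python
import textwrap
from string import Template

# A multi-line YAML fragment we want to place under `labels:`.
labels_yaml = "app: web\ntier: frontend"

# The template only sees characters; it has no idea `labels` is a mapping.
template = Template("metadata:\n  labels:\n    $labels\n")

# Naive substitution: only the first line inherits the template's indentation.
naive = template.substitute(labels=labels_yaml)

# The "hack": indent every line by a hand-counted number of spaces, then
# strip the first line's indent because the template already provides it.
fixed = template.substitute(
    labels=textwrap.indent(labels_yaml, " " * 4).lstrip()
)

print(naive)
print(fixed)
```

In the naive output, `tier: frontend` lands at column zero and silently becomes a top-level key instead of a label. The fix works, but only because somebody counted four spaces by hand, and it breaks again the moment the block moves to a different nesting level.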
Everything about Helm seems to me like an “obvious solution” (as Saad Ali put it) to the problem of deploying workloads on Kubernetes. Just like my first approach, it gave an answer to “What” to do about this problem, but not to “Why” it’s done that way. The Go templating engine currently used to generate Helm manifests could’ve easily been Jinja, had the project been started in Python. Or even simpler, it could’ve been Python itself, with no extra libraries, generating structured YAML output from dicts.
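To show what I mean by generating structured output from dicts, here is a minimal sketch using only the Python standard library. The `deployment` helper, its parameters, and the image name are hypothetical. Since the standard library has no YAML emitter, the sketch serializes to JSON, which kubectl also accepts.

```python
import json


def deployment(name: str, image: str, replicas: int = 1) -> dict:
    """Build a Deployment as plain data; nesting is structure, not whitespace."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Indentation is the serializer's job, so it cannot be gotten wrong by hand.
manifest = json.dumps(deployment("web", "nginx:1.25", replicas=3), indent=2)
print(manifest)
```

This is not a recommendation to write your manifests in Python. It just shows that once the data is structured, the whole class of indentation bugs disappears, regardless of which language or tool does the composing.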
Nowadays, there seem to be much better tools for composing pieces of structured data. Kustomize, jsonnet, CUE, ytt, and Dhall are just some of the tools that are simply better fitted for the job. The same goes for managing the lifecycle of applications. Kubernetes, for example, has built-in controllers for managing deployments and statefulsets. These might not fit your use-case, and you certainly don’t have to agree with how they are implemented, but you’re entirely free to build your own controller/operator to manage the lifecycle of your apps in a different way. However, Helm has chosen to stay client-side, which is something the ecosystem is increasingly moving away from, most obviously with the addition of SSA.
Meeting long-time Kubernetes users the other way
On a similar topic, whilst Kubernetes tried its best to build the tools to make new users feel welcome, Helm did not seem to bother applying the same principle to existing Kubernetes users. I previously mentioned how the Helm binary is trying to replace kubectl. As a long-time Kubernetes user and operator, I really (and I mean reallyyyyyy…) have gotten used to being able to view transparent APIs through the single lens of kubectl to debug and solve issues. Helm instead takes this power away from you, and fragments the ecosystem.
Going in the wrong direction
To get a sense of whether the Helm project is at least going in the right direction and correcting some of the design flaws I mentioned above, I looked at some of the new features they’re adding. Two of them are chart values validation and a post-renderer. While you might argue that both of these features are very useful (and I wouldn’t dare to deny that), both of them seem to break either the Kubernetes design principles or its engineering principles.
The post-renderer comes as a solution to a problem that Helm itself introduced. Published Helm charts are not fully customisable by their users. This is because Helm broke one of the key ideas of Kubernetes and decided to hide and abstract Kubernetes objects away from the users. To solve this issue, the Helm project introduced a post-rendering feature. While I was initially excited about the idea of Helm embracing configuration engines other than Go templating (such as Kustomize, which is explicitly mentioned in the docs), the feature still doesn’t fully adhere to some of the same core ideas of the Kubernetes project. Specifically, post-rendering is not a declarative feature. You cannot specify in code that a post-renderer must be used to install a particular Helm chart. Rather, it relies on manually specifying the correct flag when issuing the helm command, and thus breaks the declarative intent of configuration management.
It’s even more important to note that this feature goes directly against the core idea of Helm - abstraction. If you want to use post-rendering, you probably need to be intimately familiar with the templates of the upstream Chart. Even though such a move is welcome to a certain extent, Helm’s design clearly continues to prioritize abstraction over transparent APIs, and additionally introduces some very puzzling and contradictory design.
Chart values validation
One of the most prominent features of Helm Charts is the ability to abstract away the configuration of Kubernetes objects and instead let the user provide the high-level settings they would like for their installation of a particular Chart. These high-level settings are configured in the shape of a free-form YAML document. Unfortunately, free-form YAML is not exactly best suited for providing parameters with a predefined structure. If you want to change anything about your deployment, the usual way to do so is to look up the default values (present in the Helm Chart), and then specify overrides for what you want to change. But default values are not always exhaustive and thus provide only partial documentation. So in order to make sure you’re changing what you want, or to see if some setting can even be changed, you end up just reading the Chart templates - the exact thing that Helm wanted to avoid. Once more, Helm tried to solve a problem it introduced by creating another feature - Validating Chart Values with JSONSchema. While again, you might say that this feature is completely sensible, some people will recognize this as something that already exists in the Kubernetes ecosystem. I’m talking about CRDs, which not only provide a way to validate the values of Kubernetes objects being applied to the cluster, but also provide clear and complete documentation of all the possible settings. Yet again, Helm looks like it’s reinventing the wheel.
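To illustrate why a schema matters for free-form values, here is a toy validator in Python. It is a hand-rolled stand-in that supports only a tiny, made-up subset of schema keywords (`type`, `properties`, `additionalProperties`). It is not JSON Schema, not Helm’s implementation, and not how CRD validation works internally, and the value names are hypothetical.

```python
TYPES = {"object": dict, "string": str, "integer": int, "boolean": bool}


def validate(value, schema: dict, path: str = "values") -> list:
    """Return a list of human-readable errors (an empty list means valid)."""
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(value, TYPES[expected])
        if expected == "integer" and isinstance(value, bool):
            ok = False  # bool is a subclass of int in Python
        if not ok:
            return [f"{path}: expected {expected}, got {type(value).__name__}"]
    errors = []
    if isinstance(value, dict):
        properties = schema.get("properties", {})
        for key, item in value.items():
            if key in properties:
                errors += validate(item, properties[key], f"{path}.{key}")
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}.{key}: unknown key")
    return errors


schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "replicaCount": {"type": "integer"},
        "image": {
            "type": "object",
            "additionalProperties": False,
            "properties": {
                "repository": {"type": "string"},
                "tag": {"type": "string"},
            },
        },
    },
}

# Two typical mistakes: a quoted number and a misspelled key ("tga").
overrides = {"replicaCount": "3", "image": {"tga": "1.25"}}
errors = validate(overrides, schema)
print(errors)
```

Without a schema, both mistakes pass silently: the misspelled key is simply never read by any template. With one, they are rejected up front, and the schema doubles as documentation of what can be set.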
Kubernetes was built on small components that each do one thing, and do it well. There were discussions about splitting Helm up into multiple components as well, but right now it’s an ever-growing monolith, contrary to the Kubernetes design principles. The purpose of Helm is supposed to be a package manager for applications on Kubernetes. And honestly, if this was the only way Helm was used, I’d probably not only be fine with it, but embrace it. The problem is that Helm tries to be much more than that, and in the wrong way. It’s trying to be a configuration engine, kubectl, and an app & rollout lifecycle manager on the client side. And on top of all that, also a package manager.
From all the functionality that I’ve seen, it seems to me that Helm should be a package manager for well-known applications that need to be distributed globally. Additionally, the architecture would hold up much better if individual Charts were implemented as CRDs, and Helm as a general purpose app controller rather than a CLI tool. To be fair, some of these points were also a topic of various discussions, like the Helm 3 proposal documents created two years ago, or a Helm summit mentioned in the Kustomize whitepaper, but unfortunately they didn’t get much traction.
Helm has a very large community, and it has only gotten better since it was initially released. I hope the trend continues so that Helm integrates better with the ecosystem in the future. So if you enjoy using the tool, I’m not here to convince you otherwise. But if you have been left with a desire for more, and if you recognize what the Kubernetes project has brought to the software engineering community so far, I wholeheartedly recommend trying out different tools that try to adhere to the core Kubernetes principles. One such tool is the already mentioned Kustomize. Check out the resources section of Kustomize to get a good introduction. And if you’re ready for more, explore the Kustomize reference and learn how to extend Kustomize with plugins.
Finally, to give you some idea of what the world outside the Helm circle could look like, I’m happy to say that I’m looking forward to trying out a combination of Kustomize as the configuration engine, flux2 for app lifecycle management, and flagger for app rollout.