Running Kubernetes the very hard way has been one of the most time-consuming projects of the last few months. I faced a lot of challenges, most of them related to running on bare metal, but I have been able to find nice solutions to all of them.
When I got a real HomeLab node capable of running virtual machines, I wondered if it was possible to create a setup as cloud-like as possible using components fitting on a single table. And it worked!
For an overview of the full hardware layer, see HomeLab. The core component for Kubernetes is Proxmox, the hypervisor running the actual virtual machines. Furthermore, FreeNAS provides storage.
The first challenge was… actually installing Kubernetes. On bare metal one loses most of the comfort of those cloud-based one-click solutions (GKE, EKS, etc.). And installers like Kubespray were rather useless for me, as they are still deeply tied to AWS or GCP. Since I am running on my little x86 host, those were out as well. After a lot of searching, two solutions seemed promising: RKE by Rancher and Pharos by Kontena. I tried both but went with Pharos, as it actually worked on the first try (sorry, Rancher), and it employs kubeadm for the actual installation, which I liked as well. After three months and several burning clusters, I am still very happy with how Pharos turned out. The intelligence behind `pharos up` saved my cluster several times (broken etcd – ouch, etc.).
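For reference, Pharos describes the whole cluster declaratively in a `cluster.yml`. A minimal sketch, with host addresses and SSH details as placeholders and field names as I remember them from the Pharos docs:

```yaml
# cluster.yml – minimal Pharos cluster definition (sketch)
hosts:
  - address: 10.0.0.10        # master VM on Proxmox (placeholder)
    role: master
    user: ubuntu
    ssh_key_path: ~/.ssh/id_rsa
  - address: 10.0.0.11        # worker VM (placeholder)
    role: worker
    user: ubuntu
    ssh_key_path: ~/.ssh/id_rsa
network:
  provider: weave             # Weave Net as the CNI
```

Running `pharos up` against this file installs – or later upgrades and repairs – the whole cluster.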
Once `kubectl get pods` finally returned something, the next topic to address was persistence. How to get `PersistentVolumes` without a provided `StorageClass`? Luckily, as it turns out, there is the `nfs-client-provisioner`, which I hooked up to my FreeNAS instance and I was good to go: dynamic `PersistentVolumes`, created for my `PersistentVolumeClaims`. And thanks to ZFS, snapshots were easy, although not as integrated as on the public clouds.
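To illustrate how this plays out: the provisioner registers a `StorageClass`, and every claim against it becomes a directory on the NFS share. A sketch, where the class name and provisioner string are illustrative and must match the actual deployment:

```yaml
# StorageClass backed by nfs-client-provisioner (names are illustrative)
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nfs-client
provisioner: cluster.local/nfs-client-provisioner  # must match PROVISIONER_NAME
reclaimPolicy: Delete
---
# A claim whose PersistentVolume gets provisioned dynamically on FreeNAS
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim
spec:
  storageClassName: nfs-client
  accessModes: [ReadWriteMany]
  resources:
    requests:
      storage: 1Gi
```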
Pharos defaulted to `weave-net` as the CNI, which was fine because it just works™. But that is not the end of the story: my applications needed outside access. Services of type `LoadBalancer` let me down, as there is no glue code for load balancers on bare metal. But wait, there is MetalLB, and whatever magic it uses, my services got IPs on the local network and I was good to go.
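The "magic" is mostly Layer 2: MetalLB answers ARP for the addresses it hands out. At the time it was configured via a ConfigMap with an address pool carved out of the home LAN; the range below is an example:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
      - name: default
        protocol: layer2
        addresses:
          - 192.168.1.240-192.168.1.250   # free range on the local network
```

With this in place, every Service of type `LoadBalancer` gets one of these IPs assigned automatically.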
The `nginx-ingress-controller` did well, and with `cert-manager` connected to Let's Encrypt and my Vault PKI, the need for both external and internal encryption was satisfied!
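As a sketch of the Let's Encrypt side (shown with the current `cert-manager.io` API group; the group and fields were named slightly differently back then, and the e-mail is a placeholder):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: me@example.com              # placeholder
    privateKeySecretRef:
      name: letsencrypt-account-key    # Secret storing the ACME account key
    solvers:
      - http01:
          ingress:
            class: nginx               # challenges solved via nginx-ingress
```

An Ingress then only needs a `cert-manager.io/cluster-issuer: letsencrypt` annotation plus a `tls` section, and certificates appear and renew on their own.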
Okay, so far it works. But nobody really knows what's going on. Time to address this!
Obviously, log messages contained a lot of information waiting to be utilized. But all components were logging to files lost in the depths of the disks. This was quickly changed by letting Fluent Bit collect them and forward all logs to Elasticsearch. As the resource requirements of a self-hosted EFK stack are a financial disaster, I went with a free SaaS instead, namely Logsene by Sematext.
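The Fluent Bit side boils down to tailing the container logs and pointing an `es` output at the Elasticsearch-compatible endpoint. A trimmed sketch, with host and index as placeholders for the actual Logsene settings:

```ini
[INPUT]
    Name    tail
    Path    /var/log/containers/*.log
    Parser  docker
    Tag     kube.*

[OUTPUT]
    Name    es
    Match   *
    Host    logsene-receiver.example.com   # placeholder endpoint
    Port    443
    tls     On
    Index   kubernetes                     # placeholder index
```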
But there is more! All components expose those magic numbers called metrics. With Prometheus and the CoreOS Prometheus Operator, it was a joy to collect and parse them. And finally, Grafana satisfied my need for cool graphs!
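With the operator, scrape targets are declared as `ServiceMonitor` objects instead of hand-edited Prometheus config. A minimal sketch for an app exposing a named `metrics` port (names and labels are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  labels:
    release: prometheus   # must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: my-app         # selects the Service to scrape
  endpoints:
    - port: metrics       # named port on that Service
      interval: 30s
```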
At this point I had invested enough time that a disaster would hurt. Luckily I was already doing GitOps, so I would be able to get up and running again quickly. If it weren't for the state. What if somebody pulled the LAN cable from FreeNAS or Proxmox and the `PersistentVolumes` went insane? Nobody likes corrupt data! To be protected against this, backups are needed. But on-premises there are no truly redundant, one-click, cheap snapshots. So I needed something different.
restic is a great deduplicating, end-to-end encrypted backup solution that supports the very cheap storage provider Backblaze B2. And AppsCode Stash is the missing Kubernetes integration.
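As a sketch of how this fit together at the time, using Stash's older `Restic` CRD as I remember the v1alpha1 schema; bucket, secret name, labels, and paths are all placeholders:

```yaml
apiVersion: stash.appscode.com/v1alpha1
kind: Restic
metadata:
  name: app-backup
spec:
  selector:
    matchLabels:
      app: my-app                   # workloads whose volumes get backed up
  fileGroups:
    - path: /source/data
      retentionPolicyName: keep-last-30
  backend:
    b2:
      bucket: homelab-backups       # placeholder Backblaze B2 bucket
      prefix: k8s
    storageSecretName: b2-secret    # restic password + B2 credentials
  schedule: '@every 6h'
  retentionPolicies:
    - name: keep-last-30
      keepLast: 30
      prune: true
```

Stash then injects a restic sidecar into the matching workloads and ships encrypted, deduplicated snapshots to B2 on the given schedule.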