KubeZero - ZeroDownTime Kubernetes Platform
Stefan Reimer
KubeZero - Zero Down Time Kubernetes platform
KubeZero is a Kubernetes distribution providing an integrated container platform so you can focus on your applications.
Design philosophy
- Cloud provider agnostic, bare-metal/self-hosted
- Focus on security and simplicity over feature creep
- No vendor lock-in; most components are optional and can be easily exchanged
- Organic Open Source / open and permissive licenses over closed-source solutions
- No premium services / subscriptions required
- Staying up to date and contributing back to upstream projects, like alpine-cloud-images and others
- Corgi approved 🐶
Architecture
Version / Support Matrix
KubeZero releases track the same minor version of Kubernetes.
Any 1.26.X-Y release of KubeZero supports any Kubernetes cluster 1.26.X.
KubeZero is distributed as a collection of versioned Helm charts, allowing custom upgrade schedules and module versions as needed.
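The compatibility rule above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical and not part of KubeZero's tooling, and the sketch assumes release strings follow the MAJOR.MINOR.PATCH-BUILD shape implied by 1.26.X-Y.

```python
import re

# Hypothetical helpers for illustration; KubeZero does not ship these functions.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-(\d+))?$")


def minor_line(version: str) -> tuple[int, int]:
    """Return (major, minor) for strings such as '1.26.3' or '1.26.3-1'."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(kubezero_release: str, cluster_version: str) -> bool:
    """A release supports a cluster when both share the major and minor version."""
    return minor_line(kubezero_release) == minor_line(cluster_version)


if __name__ == "__main__":
    print(is_supported("1.26.7-2", "1.26.3"))  # same 1.26 line
    print(is_supported("1.26.7-2", "1.25.9"))  # different minor version
```

Comparing only the major and minor parts keeps the check independent of patch and build numbers, which matches the statement that any 1.26.X-Y release covers any 1.26.X cluster.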
%%{init: {'theme':'dark'}}%%
gantt
title KubeZero Support Timeline
dateFormat YYYY-MM-DD
section 1.24
beta :124b, 2022-11-14, 2022-12-31
release :after 124b, 2023-06-01
section 1.25
beta :125b, 2023-03-01, 2023-03-31
release :after 125b, 2023-08-01
section 1.26
beta :126b, 2023-06-01, 2023-06-30
release :after 126b, 2023-10-01
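As a rough aid for reading the chart, the sketch below encodes the same dates and reports which phase a version is in on a given day. The function names are hypothetical, and treating both end dates as inclusive is an assumption of this sketch rather than something the chart specifies.

```python
from datetime import date

# Dates copied from the support timeline chart above:
# (beta start, beta end, release end) per minor version.
# Treating both end dates as inclusive is an assumption of this sketch.
TIMELINE = {
    "1.24": (date(2022, 11, 14), date(2022, 12, 31), date(2023, 6, 1)),
    "1.25": (date(2023, 3, 1), date(2023, 3, 31), date(2023, 8, 1)),
    "1.26": (date(2023, 6, 1), date(2023, 6, 30), date(2023, 10, 1)),
}


def phase(version: str, day: date) -> str:
    """Return 'beta', 'release', or 'outside' for a version on a given day."""
    beta_start, beta_end, release_end = TIMELINE[version]
    if beta_start <= day <= beta_end:
        return "beta"
    if beta_end < day <= release_end:
        return "release"
    return "outside"


def on_timeline(day: date) -> dict[str, str]:
    """Map every version that is in beta or release on that day to its phase."""
    phases = {version: phase(version, day) for version in TIMELINE}
    return {v: p for v, p in phases.items() if p != "outside"}


if __name__ == "__main__":
    print(on_timeline(date(2023, 6, 15)))
```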
Components
OS
- all nodes are based on Alpine v3.17
- 2 GB encrypted root filesystem
- no 3rd party dependencies at boot ( other than container registries )
- minimal attack surface
- extremely small memory footprint / overhead
Container runtime
- cri-o rather than Docker for improved security and performance
Control plane
- all Kubernetes components compiled against Alpine OS using buildmode=pie
- support for single node control plane for small clusters / test environments to reduce costs
- access to control plane from within the VPC only by default ( VPN access required for Admin tasks )
- controller nodes are used for various platform admin controllers / operators to reduce costs and noise on worker nodes
GitOps
- CLI / command-line install
- optional full ArgoCD support and integration
- fuse device plugin support to build containers as part of a CI pipeline leveraging rootless podman build agents
AWS integrations
- IAM roles for service accounts allowing each pod to assume individual IAM roles
- access to meta-data services is blocked for all workload containers on all nodes
- all IAM roles are maintained via CloudBender automation
- aws-node-termination-handler integrated
- support for spot instances per worker group incl. early draining etc.
- support for Inf1 instances as part of AWS Neuron
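To illustrate the per-pod IAM role idea, here is a minimal sketch that builds a ServiceAccount manifest carrying a role annotation. The annotation key is the default used by the upstream EKS pod identity webhook; whether KubeZero keeps that default is an assumption here, and the service account name, namespace, account ID, and role name are placeholders.

```python
import json

# Default annotation key of the upstream EKS pod identity webhook.
# Whether KubeZero keeps this default is an assumption of this sketch.
ROLE_ARN_ANNOTATION = "eks.amazonaws.com/role-arn"


def service_account_manifest(name: str, namespace: str, role_arn: str) -> dict:
    """Build a ServiceAccount manifest that requests a specific IAM role."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {ROLE_ARN_ANNOTATION: role_arn},
        },
    }


if __name__ == "__main__":
    # Placeholder account ID and role name, for illustration only.
    manifest = service_account_manifest(
        "report-writer",
        "reports",
        "arn:aws:iam::111122223333:role/report-writer",
    )
    print(json.dumps(manifest, indent=2))
```

Pods that run under such a service account would then receive credentials for that one role only, which is what lets each workload hold its own narrowly scoped permissions.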
Network
- Cilium using Geneve encapsulation, incl. increased MTU, allowing flexible / more containers per worker node compared to e.g. AWS VPC CNI
- Multus support for multiple network interfaces per pod, e.g. additional AWS CNI
- no restrictions on IP space / sizing from the underlying VPC architecture
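The link between encapsulation and MTU comes down to simple arithmetic, sketched below. The header sizes are those of Geneve over IPv4 without options; the underlay MTU values are example inputs, not settings taken from KubeZero, and the function names are hypothetical.

```python
# Header sizes in bytes for Geneve over IPv4, base header without options.
OUTER_IPV4 = 20
OUTER_UDP = 8
GENEVE_BASE = 8
INNER_ETHERNET = 14


def geneve_overhead(option_bytes: int = 0) -> int:
    """Bytes added to each packet by the tunnel headers."""
    return OUTER_IPV4 + OUTER_UDP + GENEVE_BASE + INNER_ETHERNET + option_bytes


def pod_mtu(underlay_mtu: int, option_bytes: int = 0) -> int:
    """Largest packet a pod can send without exceeding the underlay MTU."""
    return underlay_mtu - geneve_overhead(option_bytes)


if __name__ == "__main__":
    print(geneve_overhead())  # 50
    print(pod_mtu(1500))      # 1450, standard Ethernet underlay
    print(pod_mtu(9001))      # 8951, jumbo-frame underlay (example value)
```

With a standard 1500-byte underlay the tunnel leaves 1450 bytes per packet for pod traffic, so raising the underlay MTU is what keeps the encapsulation overhead from shrinking the usable payload.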
Storage
- flexible EBS support incl. zone awareness
- EFS support via automated EFS provisioning for worker groups via CloudBender automation
- local storage provider (OpenEBS LVM) for latency sensitive high performance workloads
- CSI Snapshot controller and Gemini snapshot groups and retention
Ingress
- AWS Network Load Balancer and Istio Ingress controllers
- no additional costs per exposed service
- real client source IP available to workloads via HTTP header and access logs
- ACME SSL Certificate handling via cert-manager incl. renewal etc.
- support for TCP services
- optional rate limiting support
- optional full service mesh
Metrics
- Prometheus support for all components, incl. out-of-cluster EC2 instances (node_exporter)
- automated service discovery allowing instant access to common workload metrics
- pre-configured Grafana dashboards and alerts
- Alertmanager events via SNSAlertHub to Slack, Google, Matrix, etc.
Logging
- all container logs are enhanced with Kubernetes and AWS metadata to provide context for each message
- flexible Elasticsearch setup, leveraging the ECK operator, for easy maintenance & minimal admin knowledge required, incl. automated backups to S3
- Kibana allowing easy search and dashboards for all logs, incl. pre-configured index templates and index management
- fluentd-concenter service providing queuing during high load as well as additional parsing options
- lightweight fluent-bit agents on each node requiring minimal resources, forwarding logs securely via TLS to fluentd-concenter
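The metadata enrichment described in this list can be pictured with a small sketch. In the platform this work is done by the logging agents, not by Python code; the function and every field name below are hypothetical and chosen only to show the shape of an enriched record.

```python
def enrich(record: dict, kubernetes: dict, aws: dict) -> dict:
    """Return a copy of a log record with Kubernetes and AWS context attached."""
    enriched = dict(record)  # leave the caller's record untouched
    enriched["kubernetes"] = dict(kubernetes)
    enriched["aws"] = dict(aws)
    return enriched


if __name__ == "__main__":
    # All values below are made-up placeholders.
    raw = {"log": "connection refused", "stream": "stderr"}
    k8s = {"namespace": "shop", "pod": "checkout-0", "container": "api"}
    cloud = {"region": "eu-central-1", "instance_id": "i-0123456789abcdef0"}
    print(enrich(raw, k8s, cloud))
```

Nesting the context under separate keys keeps the original message fields intact, so a search can filter by namespace or instance without parsing the log line itself.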