chore: update docs

Stefan Reimer 2021-07-01 16:42:39 +02:00
parent 3ee27d7da5
commit eab6f8253c
6 changed files with 76 additions and 74 deletions


@ -13,12 +13,14 @@ KubeZero is a Kubernetes distribution providing an integrated container platform
# Version / Support Matrix
| KubeZero \ Kubernetes Version | v1.18 | v1.19 | v1.20 | EOL |
| KubeZero \ Kubernetes Version | v1.19 | v1.20 | v1.21 | EOL |
|----------------------------------------|-------|-------|-------|-------------|
| master branch | yes | yes | beta | |
| stable branch | yes | yes | no | |
| v2.19.0 | yes | yes | no | 30 Jun 2021 |
| v2.18.0 | yes | no | no | 30 Apr 2021 |
| master branch | no | yes | alpha | |
| stable branch | yes | no | no | |
| v2.20.0 | no | yes | no | 30 Aug 2021 |
| v2.19.0 | yes | no | no | 30 Aug 2021 |
[Upstream release policy](https://kubernetes.io/releases/)
# Architecture
![aws_architecture](docs/aws_architecture.png)


@ -1,4 +1,3 @@
{{- if .Values.disabledfor120 }}
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
@ -7,4 +6,3 @@ handler: runc
overhead:
podFixed:
memory: 16Mi
{{- end }}


@ -1,6 +1,6 @@
# kubezero-logging
![Version: 0.6.5](https://img.shields.io/badge/Version-0.6.5-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.5.0](https://img.shields.io/badge/AppVersion-1.5.0-informational?style=flat-square)
![Version: 0.7.0](https://img.shields.io/badge/Version-0.7.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.6.0](https://img.shields.io/badge/AppVersion-1.6.0-informational?style=flat-square)
KubeZero Umbrella Chart for complete EFK stack
@ -18,9 +18,9 @@ Kubernetes: `>= 1.18.0`
| Repository | Name | Version |
|------------|------|---------|
| | eck-operator | 1.5.0 |
| | fluent-bit | 0.15.4 |
| | fluentd | 0.2.2 |
| | eck-operator | 1.6.0 |
| | fluent-bit | 0.15.14 |
| | fluentd | 0.2.6 |
| https://zero-down-time.github.io/kubezero/ | kubezero-lib | >= 0.1.3 |
## Changes from upstream
@ -108,9 +108,9 @@ Kubernetes: `>= 1.18.0`
| fluentd.fileConfigs."00_system.conf" | string | `"<system>\n root_dir /var/log/fluentd\n # log_level debug\n workers 2\n</system>"` | |
| fluentd.fileConfigs."01_sources.conf" | string | `"<source>\n @type http\n @label @KUBERNETES\n port 9880\n bind 0.0.0.0\n keepalive_timeout 30\n</source>\n\n<source>\n @type forward\n @label @KUBERNETES\n port 24224\n bind 0.0.0.0\n # skip_invalid_event true\n send_keepalive_packet true\n <security>\n self_hostname \"#{ENV['HOSTNAME']}\"\n shared_key {{ .Values.shared_key }}\n </security>\n</source>"` | |
| fluentd.fileConfigs."02_filters.conf" | string | `"<label @KUBERNETES>\n # prevent log feedback loops eg. ES has issues etc.\n # discard logs from our own pods\n <match kube.logging.fluentd>\n @type relabel\n @label @FLUENT_LOG\n </match>\n\n <match **>\n @type relabel\n @label @DISPATCH\n </match>\n</label>"` | |
| fluentd.fileConfigs."04_outputs.conf" | string | `"<label @OUTPUT>\n <match **>\n @id out_es\n @type elasticsearch\n @log_level info\n include_tag_key true\n id_key id\n remove_keys id\n\n # KubeZero pipeline incl. GeoIP etc.\n pipeline fluentd\n\n hosts \"{{ .Values.output.host }}\"\n port 9200\n scheme http\n user elastic\n password \"#{ENV['OUTPUT_PASSWORD']}\"\n\n log_es_400_reason\n logstash_format true\n reconnect_on_error true\n reload_on_failure true\n request_timeout 60s\n suppress_type_name true\n slow_flush_log_threshold 40.0\n # bulk_message_request_threshold 2097152\n\n <buffer tag>\n @type file_single\n chunk_limit_size 16MB\n total_limit_size 4GB\n flush_mode interval\n flush_thread_count 2\n flush_interval 10s\n flush_at_shutdown true\n retry_type exponential_backoff\n retry_timeout 2h\n flush_thread_interval 30s\n overflow_action drop_oldest_chunk\n disable_chunk_backup true\n </buffer>\n </match>\n</label>"` | |
| fluentd.fileConfigs."04_outputs.conf" | string | `"<label @OUTPUT>\n <match **>\n @id out_es\n @type elasticsearch\n @log_level info\n include_tag_key true\n id_key id\n remove_keys id\n\n # KubeZero pipeline incl. GeoIP etc.\n pipeline fluentd\n\n hosts \"{{ .Values.output.host }}\"\n port 9200\n scheme http\n user elastic\n password \"#{ENV['OUTPUT_PASSWORD']}\"\n\n log_es_400_reason\n logstash_format true\n reconnect_on_error true\n reload_on_failure true\n request_timeout 60s\n suppress_type_name true\n slow_flush_log_threshold 50.0\n\n # Retry failed bulk requests\n # https://github.com/uken/fluent-plugin-elasticsearch#unrecoverable-error-types\n unrecoverable_error_types [\"out_of_memory_error\"]\n bulk_message_request_threshold 2097152\n\n <buffer>\n @type file\n\n flush_mode interval\n flush_thread_count 1\n flush_interval 30s\n\n chunk_limit_size 4MB\n total_limit_size 2GB\n\n flush_at_shutdown true\n retry_type exponential_backoff\n retry_timeout 2h\n overflow_action drop_oldest_chunk\n disable_chunk_backup true\n </buffer>\n </match>\n</label>"` | |
| fluentd.image.repository | string | `"fluent/fluentd-kubernetes-daemonset"` | |
| fluentd.image.tag | string | `"v1.12-debian-elasticsearch7-1"` | |
| fluentd.image.tag | string | `"v1-debian-elasticsearch"` | |
| fluentd.istio.enabled | bool | `false` | |
| fluentd.kind | string | `"Deployment"` | |
| fluentd.metrics.serviceMonitor.additionalLabels.release | string | `"metrics"` | |
@ -141,7 +141,7 @@ Kubernetes: `>= 1.18.0`
| kibana.istio.enabled | bool | `false` | |
| kibana.istio.gateway | string | `"istio-system/ingressgateway"` | |
| kibana.istio.url | string | `""` | |
| version | string | `"7.11.1"` | |
| version | string | `"7.13.2"` | |
## Resources:


@ -18,8 +18,8 @@ Kubernetes: `>= 1.18.0`
| Repository | Name | Version |
|------------|------|---------|
| | kube-prometheus-stack | 15.4.4 |
| https://prometheus-community.github.io/helm-charts | prometheus-adapter | 2.12.3 |
| | kube-prometheus-stack | 16.12.0 |
| https://prometheus-community.github.io/helm-charts | prometheus-adapter | 2.14.2 |
| https://zero-down-time.github.io/kubezero/ | kubezero-lib | >= 0.1.3 |
## Values
@ -67,7 +67,6 @@ Kubernetes: `>= 1.18.0`
| kube-prometheus-stack.grafana.service.portName | string | `"http-grafana"` | |
| kube-prometheus-stack.grafana.sidecar.dashboards.provider.foldersFromFilesStructure | bool | `true` | |
| kube-prometheus-stack.grafana.sidecar.dashboards.searchNamespace | string | `"ALL"` | |
| kube-prometheus-stack.grafana.sidecar.image.tag | string | `"1.12.0"` | |
| kube-prometheus-stack.grafana.testFramework.enabled | bool | `false` | |
| kube-prometheus-stack.kube-state-metrics.nodeSelector."node-role.kubernetes.io/master" | string | `""` | |
| kube-prometheus-stack.kube-state-metrics.podSecurityPolicy.enabled | bool | `false` | |
@ -110,6 +109,7 @@ Kubernetes: `>= 1.18.0`
| kube-prometheus-stack.prometheus.prometheusSpec.resources.requests.cpu | string | `"500m"` | |
| kube-prometheus-stack.prometheus.prometheusSpec.resources.requests.memory | string | `"512Mi"` | |
| kube-prometheus-stack.prometheus.prometheusSpec.retention | string | `"8d"` | |
| kube-prometheus-stack.prometheus.prometheusSpec.ruleSelectorNilUsesHelmValues | bool | `false` | |
| kube-prometheus-stack.prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues | bool | `false` | |
| kube-prometheus-stack.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.accessModes[0] | string | `"ReadWriteOnce"` | |
| kube-prometheus-stack.prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage | string | `"16Gi"` | |
@ -132,8 +132,8 @@ Kubernetes: `>= 1.18.0`
| prometheus-adapter.prometheus.url | string | `"http://metrics-kube-prometheus-st-prometheus"` | |
| prometheus-adapter.rules.default | bool | `false` | |
| prometheus-adapter.rules.resource.cpu.containerLabel | string | `"container"` | |
| prometheus-adapter.rules.resource.cpu.containerQuery | string | `"sum(irate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!=\"POD\",container!=\"\",pod!=\"\"}[3m])) by (<<.GroupBy>>)"` | |
| prometheus-adapter.rules.resource.cpu.nodeQuery | string | `"sum(1 - irate(node_cpu_seconds_total{mode=\"idle\"}[3m]) * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}) by (<<.GroupBy>>)"` | |
| prometheus-adapter.rules.resource.cpu.containerQuery | string | `"sum(irate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!=\"POD\",container!=\"\",pod!=\"\"}[5m])) by (<<.GroupBy>>)"` | |
| prometheus-adapter.rules.resource.cpu.nodeQuery | string | `"sum(1 - irate(node_cpu_seconds_total{mode=\"idle\"}[5m]) * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}) by (<<.GroupBy>>)"` | |
| prometheus-adapter.rules.resource.cpu.resources.overrides.namespace.resource | string | `"namespace"` | |
| prometheus-adapter.rules.resource.cpu.resources.overrides.node.resource | string | `"node"` | |
| prometheus-adapter.rules.resource.cpu.resources.overrides.pod.resource | string | `"pod"` | |
@ -143,15 +143,13 @@ Kubernetes: `>= 1.18.0`
| prometheus-adapter.rules.resource.memory.resources.overrides.namespace.resource | string | `"namespace"` | |
| prometheus-adapter.rules.resource.memory.resources.overrides.node.resource | string | `"node"` | |
| prometheus-adapter.rules.resource.memory.resources.overrides.pod.resource | string | `"pod"` | |
| prometheus-adapter.rules.resource.window | string | `"3m"` | |
| prometheus-adapter.rules.resource.window | string | `"5m"` | |
| prometheus-adapter.tolerations[0].effect | string | `"NoSchedule"` | |
| prometheus-adapter.tolerations[0].key | string | `"node-role.kubernetes.io/master"` | |
# Dashboards
## Etcd
- https://grafana.com/grafana/dashboards/3070
## ElasticSearch
- https://grafana.com/grafana/dashboards/266
## Alertmanager
- https://grafana.com/api/dashboards/9578/revisions/4/download
## Prometheus
- https://grafana.com/api/dashboards/3662/revisions/2/download


@ -1,6 +1,6 @@
# kubezero-redis
![Version: 0.2.1](https://img.shields.io/badge/Version-0.2.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square)
![Version: 0.2.2](https://img.shields.io/badge/Version-0.2.2-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square)
KubeZero Umbrella Chart for Redis HA


@ -1,68 +1,71 @@
# Upgrade to KubeZero V2.20 / Kubernetes 1.20
# CloudBender
## Changes
### Single node control plane
- Control
## Upgrade
- Set the specific wanted Kubernetes version in the controller config to eg. `v1.20.2`
- configure your AWS CLI profile as well as your kubectl context to cluster you want to upgrade.
- verify your config ...
- run ./scripts/upgrade_120.sh
- update the CFN stack for kube-control-plane
### Single node control plane
- will automatically be upgraded and the controller node replaced as part of the CFN update
### Clustered control plane
- replace controller instances one by one in no particular order
- once confirmed that the upgraded 1.20 control plane is working as expected update the clustered control plane CFN stack once more with `LBType: none` to remove the AWS NLB fronting the Kubernetes API which is not required anymore.
- replace worker nodes in a rolling fashion via. drain / terminate / rinse-repeat
# KubeZero
# KubeZero V2.20 / Kubernetes 1.20
## New features
- Support for [Service Account Tokens](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection) incl. federation with AWS IAM
This allows pods to assume IAM roles without the need of additional services like kiam.
- Cert-manager integration now supports [cross-account issuer](https://cert-manager.io/docs/configuration/acme/dns01/route53/#cross-account-access) for AWS route53
- Optional Proxy Protocol support for Ingress Loadbalancers, which allows preserving the real client IP and at the same time solves the hairpin routing issues of the AWS NLBs, see [Istio blog](https://istio.io/v1.9/blog/2020/show-source-ip/)
### NATS
Deploy NATS services
## New modules
### MQ / NATS
Deploy [NATS](https://docs.nats.io/jetstream/jetstream) services incl. jetstream engine, Grafana dashboards etc.
### TimeCapsule
Providing backup solutions for KubeZero clusters:
- scheduled snapshots for EBS backed PVCs incl. custome retention and restore
Provides backup solutions for KubeZero clusters, such as
scheduled snapshots for EBS-backed PVCs incl. custom retention and restore.
## Changes
## Changelog
### General
- various version bumps
- replaced the deprecated node labels `failure-domain.beta.kubernetes.io/[region|zone]` with `topology.kubernetes.io/[region|zone]`; please adapt existing affinity rules!
- version bumps of all modules
- cert-manager, ebs-csi and efs-csi driver now leverage service account tokens and do not rely on kiam anymore
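Affinity rules kept in version-controlled manifests can be adapted to the new topology labels mechanically. The following is a minimal sketch, not part of KubeZero: `migrate_topology_labels` is a hypothetical helper that reads a manifest on stdin and rewrites only the two deprecated label keys.

```shell
# Hypothetical helper (not shipped with KubeZero): rewrite the deprecated
# failure-domain.beta.kubernetes.io/{region,zone} label keys to their
# topology.kubernetes.io equivalents. Reads stdin, writes stdout.
migrate_topology_labels() {
  sed \
    -e 's#failure-domain\.beta\.kubernetes\.io/region#topology.kubernetes.io/region#g' \
    -e 's#failure-domain\.beta\.kubernetes\.io/zone#topology.kubernetes.io/zone#g'
}
```

A possible invocation is `migrate_topology_labels < deployment.yaml > deployment.new.yaml`; review the resulting diff before applying it, since the helper does a plain text substitution and does not parse the YAML.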
### Logging
- version bumps for ElasticSearch, Kibana, ECK, fluentd and fluent-bit
- various fixes and tuning to improve reliability of the fluentd aggregator layer
### Istio
- hardened and optimized settings for Envoy gateway proxies
- improved deployment strategy to reduce errors during upgrades
- Added various Grafana Dashboards
- version bump to 1.10.2
## Metrics
### Metrics
- Added various dashboards for KubeZero modules
- Updated / improved dashboard organization incl. folders and tags
- Grafana Dashboards are now all provided via configmaps, no more state required, no manual changes persisted
- Grafana allows anonymous read-only access
- all dashboards ndefault to now-1h and prohibit less than 30s refresh
- Grafana Dashboards are now all provided via configmaps, no more state required, also no more manual changes persisted
- Grafana now allows anonymous read-only access
- all dashboards default to `now-1h` and prohibit less than 30s refresh cycles
- Custom dashboards can easily be provided by simply installing a ConfigMap along with workloads in any namespace
## Upgrade - Without ArgoCD
# Upgrade - CloudBender
- Set the desired Kubernetes version in the controller config, e.g. `v1.20.8`
- configure your AWS CLI profile as well as your kubectl context to point to the cluster you want to upgrade
and verify your config via `aws sts get-caller-identity` and `kubectl cluster-info`
- run `./scripts/upgrade_120.sh`
- update the CFN stack kube-control-plane for your cluster
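
The verification step above can be wrapped in a small pre-flight check. This is a sketch under assumptions: `preflight` and its account-id argument are hypothetical and not part of CloudBender or KubeZero; only the `aws` and `kubectl` commands themselves come from the steps above.

```shell
# Hypothetical pre-flight helper: abort unless the AWS CLI profile resolves to
# the expected account and kubectl can reach the cluster of the current context.
preflight() {
  expected_account="$1"

  # Ask STS which account the active profile belongs to.
  account="$(aws sts get-caller-identity --query Account --output text)" || return 1
  if [ "$account" != "$expected_account" ]; then
    echo "AWS account mismatch: got $account, expected $expected_account" >&2
    return 1
  fi

  # Fails if the API server of the current context is unreachable.
  kubectl cluster-info || return 1
  echo "context: $(kubectl config current-context)"
}
```

One way to use it is `preflight 123456789012 && ./scripts/upgrade_120.sh`, where the account id is a placeholder for your own.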
### Single node control plane
- a new controller instance will automatically be launched and replace the current controller as part of the CFN update
### Clustered control plane
- replace controller instances one by one in no particular order
- once the upgraded 1.20 control plane is confirmed to be working as expected, update the clustered control plane CFN stack once more with `LBType: none` to remove the AWS NLB fronting the Kubernetes API, which is no longer required.
## Upgrade CloudBender continued
- upgrade all `kube-worker*` CFN stacks
- replace worker nodes in a rolling fashion via drain / terminate and rinse-repeat
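
One iteration of the drain / terminate cycle could look like the sketch below. It rests on assumptions that are not stated in this document: `replace_worker` is a hypothetical helper, the node's `spec.providerID` is assumed to end in the EC2 instance id, and the worker stack is assumed to launch a replacement instance on its own.

```shell
# Hypothetical helper: drain one worker node, then terminate its EC2 instance.
replace_worker() {
  node="$1"

  # Assumption: providerID has the form aws:///<zone>/<instance-id>.
  provider_id="$(kubectl get node "$node" -o jsonpath='{.spec.providerID}')" || return 1
  instance_id="${provider_id##*/}"

  # --delete-emptydir-data needs kubectl >= 1.20; older clients call this
  # flag --delete-local-data.
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1

  aws ec2 terminate-instances --instance-ids "$instance_id"
}
```

Run it for one node at a time, and wait until the replacement node reports `Ready` in `kubectl get nodes` before moving on to the next one.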
# Upgrade KubeZero
1. Update CRDs of all enabled components:
`./bootstrap.sh crds all clusters/$CLUSTER`
`./bootstrap.sh crds all clusters/$CLUSTER`
2. Prepare upgrade
- Remove legacy monitoring configmaps
- Remove previous Grafana stateful config
- Remove legacy Istio Enovyfilter
- Remove legacy Istio Envoyfilter
```
kubectl delete cm -n monitoring -l grafana_dashboard=1
@ -70,13 +73,14 @@ kubectl delete pvc metrics-grafana -n monitoring
kubectl delete envoyfilter -A -l operator.istio.io/version=1.6.9
```
3. Upgrade all components
`./bootstrap.sh deploy all clusters/$CLUSTER`
3. Upgrade all KubeZero modules:
- without ArgoCD:
- `./bootstrap.sh deploy all clusters/$CLUSTER`
- with ArgoCD:
## Upgrade - ArgoCD
- ArgoCD itself: `./bootstrap.sh deploy argocd clusters/$CLUSTER`
- push latest cluster config to your git repo
- trigger sync in ArgoCD incl. *prune* starting with the KubeZero root app
- ArgoCD itself: `./bootstrap.sh deploy argocd clusters/$CLUSTER`
- push latest cluster config to your git repo
- trigger sync in ArgoCD incl. *prune* starting with the KubeZero root app
( only if auto-sync is not enabled )
## Verification / Tests