Minor tweaks and doc updates
- cluster-autoscaler is enabled by default on AWS
- worker nodes are now automatically updated to the latest AMI and config in a rolling fashion
- integrated Bitnami Sealed Secrets controller
- reduced avg. CPU load on controller nodes to well below the 20% threshold, to prevent extra costs from CPU credits

## Version upgrades

- cilium

Ensure your Kube context points to the correct cluster!

3. Trigger cluster upgrade:
`./admin/upgrade_cluster.sh <path to the argocd app kubezero yaml for THIS cluster>`

4. Review the kubezero-config and, if all looks good, commit the ArgoApp resource for KubeZero via regular git:
git add / commit / push `<cluster/env/kubezero/application.yaml>`
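
As a minimal sketch (the file path and commit message are placeholders, adjust to your repo layout):

```
git add cluster/env/kubezero/application.yaml
git commit -m "Upgrade cluster to KubeZero v1.24"
git push
```
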
**DO NOT yet re-enable ArgoCD before all pre-v1.24 workers have been replaced!**

5. Reboot controller(s) one by one
Wait each time for the controller to join and all pods to be running.
Might take a while ...
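
One way to keep an eye on this, as a plain-`kubectl` sketch (no KubeZero-specific tooling assumed):

```
# Watch the rebooted controller come back and turn Ready
kubectl get nodes -w

# In another terminal: list pods that are not Running
# (Succeeded pods show up here too; focus on Pending/Unknown)
kubectl get pods -A --field-selector 'status.phase!=Running'
```
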
6. Upgrade CFN stacks for the workers.
This in turn will trigger automated worker updates by evicting pods and launching new workers in a rolling fashion.
Grab a coffee and keep an eye on the cluster to be safe ...
Depending on your cluster size, it might take a while to roll over all workers!
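
To check which workers are still pending replacement, something like this helps (a sketch; it simply filters on the kubelet version column):

```
# Nodes still on a pre-v1.24 kubelet have not been rolled yet
kubectl get nodes -o wide | grep -v 'v1.24'
```
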
7. Re-enable ArgoCD by hitting `<return>` on the still-waiting upgrade script.

8. Quickly head over to ArgoCD and sync the KubeZero main module as soon as possible, to reduce potential back and forth in case ArgoCD still has legacy state.
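
If you prefer the CLI over the UI, this is roughly the equivalent (the application name `kubezero` is an assumption, verify with `argocd app list`):

```
# Sync the main KubeZero application; app name is an assumption
argocd app sync kubezero
```
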
## Known issues
### Existing EFS volumes
If pods are getting stuck in `Pending` during the worker upgrade, check the status of any EFS PVC.
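
For example, to spot affected claims across all namespaces (a plain-`kubectl` sketch):

```
# List PVCs in status 'Lost' across all namespaces
kubectl get pvc -A | grep Lost
```
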
In case any PVC is in status `Lost`, edit the PVC and remove the following annotation:
```
pv.kubernetes.io/bind-completed: "yes"
```
This will instantly rebind the PVC to its PV and allow the pods to migrate.
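
Instead of editing the PVC manually, the annotation can also be removed in one step (PVC name and namespace are placeholders):

```
# The trailing '-' removes the annotation
kubectl annotate pvc <pvc-name> -n <namespace> pv.kubernetes.io/bind-completed-
```
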
This is going to be fixed during the v1.25 cycle by a planned rework of the EFS storage module.