master #16
@@ -2,7 +2,7 @@ apiVersion: v2
 name: kubezero-metrics
 description: KubeZero Umbrella Chart for prometheus-operator
 type: application
-version: 0.1.3
+version: 0.1.4
 home: https://kubezero.com
 icon: https://cdn.zero-downtime.net/assets/kubezero/logo-small-64.png
 keywords:
@@ -16,7 +16,7 @@ dependencies:
     version: ">= 0.1.3"
     repository: https://zero-down-time.github.io/kubezero/
   - name: prometheus-operator
-    version: 9.3.0
+    version: 9.3.1
     repository: https://kubernetes-charts.storage.googleapis.com/
   - name: prometheus-adapter
     version: 2.5.0
@@ -27,9 +27,11 @@ prometheus-operator:
   kubeProxy:
     enabled: true
 
-  # Disabled until we figure out how to scrape etcd with ssl client certs
   kubeEtcd:
-    enabled: false
+    enabled: true
+    service:
+      port: 2381
+      targetPort: 2381
 
   kubeControllerManager:
     enabled: true
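Enabling kubeEtcd scraping on port 2381 assumes etcd actually serves plain-HTTP metrics there. On kubeadm-managed clusters that is typically configured via etcd's `listen-metrics-urls` flag; a sketch, not part of this PR:

```yaml
# kubeadm ClusterConfiguration excerpt (hypothetical, adjust to your cluster):
# expose the etcd metrics endpoint over plain HTTP on port 2381 so Prometheus
# can scrape it without TLS client certificates
etcd:
  local:
    extraArgs:
      listen-metrics-urls: "http://0.0.0.0:2381"
```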
@@ -0,0 +1,15 @@
# api-server OAuth configuration

## Update Api-server config
Add the following extraArgs to the ClusterConfiguration configMap in the kube-system namespace:
`kubectl edit -n kube-system cm kubeadm-config`

```
oidc-issuer-url: "https://accounts.google.com"
oidc-client-id: "<CLIENT_ID from Google>"
oidc-username-claim: "email"
oidc-groups-claim: "groups"
```
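Placed under `apiServer.extraArgs` in the ClusterConfiguration, the same flags would look roughly like this (a sketch; the client ID is a placeholder):

```yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    oidc-issuer-url: "https://accounts.google.com"
    oidc-client-id: "<CLIENT_ID from Google>"
    oidc-username-claim: "email"
    oidc-groups-claim: "groups"
```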

## Resources
- https://kubernetes.io/docs/reference/access-authn-authz/authentication/
@@ -0,0 +1,9 @@
# Cluster Operations

## Clean up
### Delete evicted pods across all namespaces

`kubectl get pods --all-namespaces -o json | jq '.items[] | select(.status.reason!=null) | select(.status.reason | contains("Evicted")) | "kubectl delete pods \(.metadata.name) -n \(.metadata.namespace)"' | xargs -n 1 bash -c`
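The jq filter in the one-liner above can be sanity-checked without a cluster; the pod list below is fabricated sample data:

```shell
# Fabricated pod list: one Evicted pod, one healthy pod
cat > /tmp/sample-pods.json <<'EOF'
{"items":[
  {"metadata":{"name":"web-1","namespace":"default"},"status":{"reason":"Evicted"}},
  {"metadata":{"name":"db-0","namespace":"prod"},"status":{}}
]}
EOF
# -r drops the JSON quoting so each output line is a runnable command
jq -r '.items[] | select(.status.reason!=null) | select(.status.reason | contains("Evicted"))
       | "kubectl delete pods \(.metadata.name) -n \(.metadata.namespace)"' /tmp/sample-pods.json
```

Only the evicted pod produces a delete command; the healthy pod has no `status.reason` and is filtered out.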

### Cleanup old ReplicaSets
`kubectl get rs --all-namespaces | awk 'NR > 1 && $3 == 0 && $4 == 0 { system("kubectl delete rs " $2 " --namespace=" $1) }'`
(`NR > 1` skips the header row.)
@@ -0,0 +1,21 @@
# kubectl
kubectl is the basic command-line tool to interact with any Kubernetes cluster via the kube-api server.

## Plugins
As there are various very useful plugins for kubectl, the first thing to do is install *krew*, the plugin manager.
See https://github.com/kubernetes-sigs/krew for details.

List of awesome plugins: https://github.com/ishantanu/awesome-kubectl-plugins

### kubelogin
To log in / authenticate against an OpenID provider like Google, install the kubelogin plugin.
See https://github.com/int128/kubelogin

Make sure to adjust your kubeconfig files accordingly!
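A kubeconfig user entry wired to kubelogin might look roughly like this (a sketch: the user name, client ID, and secret are placeholders, and the issuer matches the Google example used for the api-server configuration):

```yaml
users:
- name: google-oidc          # hypothetical user name
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: kubectl
      args:
        - oidc-login
        - get-token
        - --oidc-issuer-url=https://accounts.google.com
        - --oidc-client-id=<CLIENT_ID from Google>
        - --oidc-client-secret=<CLIENT_SECRET>
        - --oidc-extra-scope=email
```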

### kauthproxy
The easiest way to access the Kubernetes dashboard, if installed in the targeted cluster, is the kauthproxy plugin.
See https://github.com/int128/kauthproxy

Once installed simply execute:
`kubectl auth-proxy -n kubernetes-dashboard https://kubernetes-dashboard.svc`
and access the dashboard via the automatically opened browser window.
@@ -0,0 +1,26 @@
## Security - Todo
- https://github.com/freach/kubernetes-security-best-practice
- https://github.com/aquasecurity/kube-bench
- https://kubernetes.io/docs/tasks/debug-application-cluster/audit/
- https://kubernetes.io/docs/tasks/debug-application-cluster/falco/

## Performance - Todo
- https://kubernetes.io/docs/tasks/administer-cluster/limit-storage-consumption/
- Set priority classes and proper CPU/MEM limits for core pods like the api-server etc., as we host additional services on the master nodes which might affect these critical systems;
  see: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/

## Storage - Todo
- OpenSource S3 - https://min.io/
- LinStor - DRBD for K8s - https://vitobotta.com/2020/01/04/linstor-storage-the-kubernetes-way/, https://github.com/kvaps/kube-linstor, https://github.com/piraeusdatastore/piraeus
- ChubaoFS - CephFS competitor

# Monitoring
- https://github.com/cloudworkz/kube-eagle

## Cleanup - Todo
Something along the lines of https://github.com/onfido/k8s-cleanup, which doesn't work as is

## Resources
- https://docs.google.com/spreadsheets/d/1WPHt0gsb7adVzY3eviMK2W8LejV0I5m_Zpc8tMzl_2w/edit#gid=0
- https://github.com/ishantanu/awesome-kubectl-plugins
@@ -0,0 +1,15 @@
# Operational guide for worker nodes

## Replace worker node
In order to change the instance type, or in general to replace worker nodes, do:

* (optional) Update the launch configuration of the worker group

* Make sure there is enough capacity in the cluster to handle all pods being evicted from the node

* `kubectl drain --ignore-daemonsets node_name`
  will evict all pods except DaemonSets. In case there are pods with local storage, review each affected pod. After being sure no important data will be lost, add `--delete-local-data` to the original command above and try again.

* Terminate the instance matching *node_name*

The new instance should take over the previous node_name, assuming only one node is being replaced at a time, and automatically join and replace the previous node.
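The replacement steps above can be sketched as a shell sequence (assuming an AWS-based worker group, as the launch configuration implies; the node name and instance id are illustrative, and commands are echoed rather than executed so the sequence can be reviewed first):

```shell
# Sketch of the node-replacement steps; NODE and the instance id are placeholders.
NODE=node_name
run() { echo "+ $*"; }   # swap the echo for "$@" to actually execute

run kubectl drain --ignore-daemonsets "$NODE"
# after reviewing any pods with local storage:
run kubectl drain --ignore-daemonsets --delete-local-data "$NODE"
# terminate the matching instance (hypothetical id):
run aws ec2 terminate-instances --instance-ids i-01234567890abcdef0
```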