Commit Graph

558 Commits

SHA1 Message Date
1218033166 Further tuning of fluentd throughput 2021-02-22 21:34:45 +01:00
eb4f22c5c2 Fix kubelet config 2021-02-22 21:32:41 +01:00
62a8f82f01 Version bump cert-manager 2021-02-22 21:32:12 +01:00
d969e53d40 Make kubeadm config work on bare-metal, minor tuning 2021-02-22 14:41:32 +01:00
8e8f747686 Kubeadm chart for 1.19, improved tooling 2021-02-12 11:04:16 +00:00
19d10828f6 README updates 2021-01-26 13:47:33 +00:00
fc45e7fd0b Istio minor version bump 2021-01-26 12:54:56 +00:00
9ca8920387 Fix changed key for kiam 2021-01-21 13:35:20 +00:00
7587564da0 Version bump for aws-ebs-csi and kiam, ES bugfix bump, fluentd tuning 2021-01-21 12:31:06 +00:00
d28e18766a CI/CD tools update 2021-01-21 10:53:53 +00:00
adefd7433b Reduce logLevel of prometheus adapter 2021-01-20 15:31:00 +00:00
da6a1fdf51 Reduce loglevel of prometheus adapter 2021-01-20 15:22:28 +00:00
a26b652690 Allow custom memory overwrites for ES cluster 2021-01-18 17:18:30 +00:00
d7091434db Add basic mapping for aws-iam-auth 2021-01-11 20:41:12 +00:00
ce7645cb57 Split out crds for aws-iam-authenticator 2021-01-04 18:13:36 +00:00
4fe40a1345 Add aws-iam-authenticator support 2021-01-04 14:56:41 +00:00
924310ca5b Remove stable repo 2021-01-03 16:33:13 +01:00
67f1157848 Integrate and patch prometheus-stack chart to customize alerts 2020-12-17 16:46:15 -08:00
4892d6c073 Switch to gp3 as default EBS class, version bump for metrics components 2020-12-17 15:36:23 -08:00
fdcb6f7e6f Remove repositories to make argo happy 2020-12-17 12:24:12 -08:00
214bfec2a4 Remove repositories to make argo happy 2020-12-17 12:22:48 -08:00
38b2d56da9 Re-add fluentd chart until we migrate off 2020-12-17 12:17:19 -08:00
521bb2a5c1 Istio version bump, ingress terminationgraceperiod patch, aws-ebs version bump 2020-12-16 03:40:14 -08:00
79dc6e9413 EBS driver version bump 2020-12-10 07:06:31 -08:00
3820858046 More logging tuning 2020-12-10 06:44:58 -08:00
89dc890c74 More logging tuning 2020-12-10 06:36:26 -08:00
a8314d4074 Lua fix fluent-bit 2020-12-08 07:15:00 -08:00
77a7ba2ed6 Integrate fluent-bit into logging to allow better config 2020-12-08 07:05:25 -08:00
f78c382be6 Use upstream released chart for aws-ebs-csi 2020-12-07 15:01:40 -08:00
5909fcd841 Fix empty CRDs, only deploy eck-operator if needed 2020-12-07 13:06:00 -08:00
835aae9df8 Re-enable geoip lookups 2020-12-07 04:33:33 -08:00
b30b41ab15 Disable CRDs from eck-operator defaults 2020-12-05 14:16:33 -08:00
a31a945094 Adjust argo ignores for latest eck webhooks 2020-12-05 14:08:40 -08:00
2a56489273 ECK fixes for Kube 1.18, Redis cluster support incl. Envoy proxy 2020-12-04 06:05:35 -08:00
33495c83de Add helm version check to bootstrap.sh 2020-12-03 02:04:08 -08:00
8fcbcb680b Minor version bump for redis, added redis-cluster support 2020-12-02 07:23:17 -08:00
83b9b566db Switch all metrics logs to json 2020-12-02 06:24:07 -08:00
89780039fc Fix service names in metrics 2020-12-02 04:30:17 -08:00
ee83391296 Add alertmanager istio config for metrics, metrics values reorg 2020-12-02 03:53:19 -08:00
a510dd06d9 More fixes and upgrade docs 2020-12-01 07:46:04 -08:00
2be387b87b ArgoCd naming fixes 2020-11-30 09:30:06 -08:00
0e7a2e70d6 More fixes 2020-11-30 04:13:52 -08:00
b0f53257ac cert-manager version bump, local-path-provisioner fixes 2020-11-30 11:34:44 +00:00
59ff3cb015 Add local-path-provisioner, re-org bootstrap 2020-11-30 01:52:11 -08:00
4b48da5935 Metrics update 2020-11-28 23:54:40 +00:00
e27692430e More bugfixes, ingress certs 2020-11-28 15:01:20 -08:00
09d2b52f74 More fixes 2020-11-27 08:19:44 -08:00
10db0f09d0 Add missing .helmignore 2020-11-26 15:31:40 -08:00
052efd077c Latest fixes, fluent-bit version bump 2020-11-26 09:37:10 -08:00
c8a903110f More fixes now adding ArgoCD 2020-11-26 05:21:10 -08:00
ec6d7a4d11 Another argo tweak 2020-11-24 07:29:38 -08:00
5b317db251 Bug fixes and argo tweaks 2020-11-24 07:18:14 -08:00
32ed7cf3a0 Revert Kube version check to make argo work 2020-11-24 06:51:48 -08:00
cd24b9fa1a First try adding argoCD day 2 2020-11-24 06:44:57 -08:00
35b1570d18 Update of various components, new argoless bootstrap working 2020-11-21 04:24:57 -08:00
cd0e559678 First steps of argoless bootstrap 2020-11-03 12:51:57 +00:00
b6929002dc Minor version bump for prometheus-stack, remove default CPU limit 2020-10-27 14:13:52 +00:00
53b638da5e Update docs, bump argo-cd parallel jobs 2020-10-27 11:54:44 +00:00
9e76512fcc Remove argocd from control plane 2020-10-21 14:18:02 +01:00
74c47d7391 Enable json logs for argo-cd finally 2020-10-21 13:29:49 +01:00
d7006faa60 Bump argo-cd chart version 2020-10-21 13:14:23 +01:00
4fb425676d Bump argo-cd version 2020-10-21 13:12:23 +01:00
8874c9869d Revert more prometheus-adapter config 2020-10-21 13:05:08 +01:00
72a917bdae Revert prometheus adapter changes 2020-10-21 12:51:15 +01:00
44d08c7abc More EFS fixes, cert-manager version bump 2020-10-21 04:37:33 -07:00
19d915cb92 Adjust prometheus URLs 2020-10-09 18:41:43 -07:00
509f8d59fb First stab at new prometheus charts 2020-10-09 17:58:44 -07:00
05993ab6b0 Cleanup 2020-10-09 12:38:20 -07:00
5781494eda Minor tweak to aws efs update tooling 2020-10-09 11:15:19 -07:00
fea850afcc Actually update the default version of aws ebs to 0.7.0 2020-10-09 11:14:51 -07:00
004503d633 AWS EBS driver version bump 2020-10-09 10:53:32 -07:00
d7132ca90c Revert minimal kube version due to issues with argocd 2020-10-09 07:43:05 -07:00
959d61ef66 Add multi PV support to EFS 2020-10-09 07:30:25 -07:00
54335c4c0a Update EFS tooling to track releases 2020-10-08 07:52:34 -07:00
4285db835d Typo 2020-10-07 09:11:22 -07:00
a951e7d9a0 New Lua function to nest entries into kube.<namespace>.* 2020-10-07 09:09:24 -07:00
cb3c6a93ba fluent-bit tag improvements 2020-10-05 17:27:58 -07:00
b0286ff858 Add some spaces 2020-10-05 09:03:47 -07:00
846d7d2d87 More logging fixes, try to decode json at the source 2020-10-05 09:01:50 -07:00
42f8a5a0b5 Disable json logging, crashed Argo 2020-10-05 08:43:18 -07:00
31f86360d9 Revert ArgoCd 1.7.7 2020-10-05 08:27:37 -07:00
baa9b69265 Latest argocd 2020-10-05 04:31:00 -07:00
5854468f09 Derp 2020-10-05 04:09:03 -07:00
c556df65ff Updated helm-docs, fluentd SSL handled by Istio, ES&Istio tuning 2020-10-05 03:50:23 -07:00
4aeb23d8cc Disable borken json parsing for now 2020-10-02 14:46:07 -07:00
bbd6d25429 Disable borken json parsing for now 2020-10-02 14:41:40 -07:00
1aba6fcbe6 Fix the warning due to double CRDs 2020-10-02 10:44:15 -07:00
cd5b38bb6c Istio version bump, make http10 support optional, enable redis,mysql protocol support 2020-10-02 10:38:09 -07:00
4cb3bd01c5 Minor fluent-bit tuning 2020-10-01 12:32:21 -07:00
84a80f3b97 Fluentd tuning 2020-10-01 10:14:04 -07:00
ea2391a212 Fluentd tuning 2020-10-01 10:11:48 -07:00
fad0597302 Disable pipeline still cpu issues 2020-09-28 04:54:47 -07:00
8de44f18d4 Reenable fluentd ingest pipeline again 2020-09-28 04:45:39 -07:00
d30ca895ec Make the kiam annotate namespace job optional 2020-09-18 16:18:59 +01:00
0939405c7a Logging fixes for NOT using nameoverride 2020-09-18 16:12:52 +01:00
2c600c2fd0 Slightly allow ArgoCD a bit more processing 2020-09-18 14:21:39 +01:00
8af14e3e8e Bump argocd to 1.7.5 as 1.7.4 has a deadlock CPU issue 2020-09-18 13:09:18 +01:00
df20d07d10 Add EnvoyFilter to enable tcp keepalive for all Ingress Envoys 2020-09-17 22:25:09 +01:00
d61752703e Revert TCP keepalive for fluentd listener 2020-09-17 19:44:34 +01:00
bcf8093b84 Enable TCP keepalive for fluentd listener 2020-09-17 19:24:24 +01:00
ec18529956 TCP keepalive tuning for Istio 2020-09-17 17:54:57 +01:00
a0873631c4 Set global meshpolicy to prevent upgrade to http2 by default 2020-09-16 16:50:48 +01:00
0b2b5acff7 Another argocd resource tweak 2020-09-15 11:48:07 +01:00
628a7e7ac9 Introduce resources for at least the argocd controller 2020-09-15 11:15:55 +01:00
93723e6a6a Docs update 2020-09-14 17:26:39 +01:00
c9a5691acf fluent-bit version bump 2020-09-14 17:26:19 +01:00
2171a4211e New bootstrap flow 2020-09-14 16:06:53 +01:00
f9770ce483 Latest deploy bootstrap tweaks 2020-09-14 15:24:40 +01:00
189899c296 Disable default poddisruptionbudgets, replace with individual todo 2020-09-11 18:21:00 +01:00
94a0db6a80 Still double CRDs 2020-09-11 16:03:22 +01:00
8460310eb8 Istio is really picky 2020-09-11 16:01:15 +01:00
efdcbe741e Move ports > 1024 as we run non-root 2020-09-11 15:45:04 +01:00
47c96ba6c5 Set JSON for access logs 2020-09-11 15:39:47 +01:00
812da69ae3 Remove xp settings 2020-09-11 15:32:14 +01:00
990bf89eab Re-enable JSON access logs 2020-09-11 15:28:51 +01:00
b873f1389e Re-enable access logs 2020-09-11 15:22:34 +01:00
3c79677715 Remove deprecated fields for 1.7 2020-09-11 15:20:51 +01:00
a83f87ad15 Remove deprecated fields for 1.7 2020-09-11 15:18:30 +01:00
6c90669fd8 Remove deprecated fields for 1.7 2020-09-11 15:15:53 +01:00
3c7f1a8f74 Version bump Istio to 1.7.1 2020-09-11 15:06:38 +01:00
bf4deda82e Remove double CRD for Istio 2020-09-11 14:42:25 +01:00
428fa56b17 more istio cleanup 2020-09-11 12:37:22 +01:00
5cd030d0db more istio cleanup 2020-09-11 12:32:46 +01:00
90a5038d31 more istio cleanup 2020-09-11 12:23:08 +01:00
6015b4ee9b Minor istio tweaks 2020-09-11 12:08:58 +01:00
2577ba826c Fix math in resources calc 2020-09-11 11:07:49 +01:00
0c6f4d06e3 Make ES heap configurable, set resources accordingly 2020-09-11 11:00:51 +01:00
a3a1f0bb8f Istio version bump to 1.6.9 2020-09-10 16:44:49 +01:00
d4b6a78c3b Add resources to Kiam 2020-09-10 14:22:47 +01:00
13c81dab53 Another prometheus resources tweak to prevent being killed during restarts 2020-09-10 14:09:23 +01:00
ec1adab48e Adjust and limit Prometheus resources 2020-09-10 14:01:28 +01:00
2b8bf02f37 ArgoCD version bump to 1.7.4 2020-09-10 13:44:48 +01:00
7c08700e71 Change log tag for audit log to not collide with regular tags 2020-09-09 20:59:03 +01:00
862fb4be9d Calico version bump to 3.16.1 2020-09-09 14:17:02 +01:00
777fe64f01 Don't remove other fields for valid json 2020-09-08 15:41:20 +01:00
8217fdd623 Fix fluentd parsing of json 2020-09-08 15:34:16 +01:00
75002ce2eb Fix fluentd typo 2020-09-08 15:07:17 +01:00
b49a864cbb Update fluentd to latest quay.io image, add json parser for message 2020-09-08 15:05:31 +01:00
1546415746 Remove CRD property to fix OutofSync Argo 2020-09-08 13:44:31 +01:00
6be2f0697f Handle empty message events 2020-09-08 13:40:09 +01:00
d04e7fa0f1 Add Lua functions to reassemble partial cri-o logs 2020-09-08 13:12:21 +01:00
48045d7afc Fluent-bit version bump and support for api audit logs 2020-09-08 12:40:28 +01:00
88725c33be Move scrape username to its own secret as eck operator cleans up otherwise 2020-09-04 01:13:39 +01:00
b8dcdc89d3 Enabled scraping etcd 2020-09-02 15:05:57 +01:00
1fec29b05f More fluentd tuning 2020-08-27 01:13:34 +01:00
28e3ce4f8e Disable ingest pipeline until we know what breaks / jams in ES 2020-08-27 01:03:35 +01:00
3e55e27bf1 Revert ES fixes as servicemonitor is retarted 2020-08-26 23:02:47 +01:00
51d9dc48fc Hardcode es user for now 2020-08-26 22:50:51 +01:00
8b82972d06 fluentd / ES fixes 2020-08-26 18:13:21 +01:00
b376544424 Fix default value 2020-08-25 14:46:22 +01:00
982685aa4b Make Istio Ingress hosts specific matching the cert 2020-08-25 14:45:56 +01:00
6adcddf4d6 Revert default fluentd image, latest has issues 2020-08-24 11:38:47 +01:00
1f0d7fae29 Increase default read-timeout for fluentd 2020-08-23 17:47:28 +01:00
6a34a198f4 Use quay.io fluentd-es image until we roll our own 2020-08-23 17:41:37 +01:00
6620416047 Fix default fluentd hostname 2020-08-23 15:50:14 +01:00
256877b736 Remove Cri parser as it is already incl. upstream now 2020-08-22 19:24:58 +01:00
94622b4f9a Update docs 2020-08-22 18:27:31 +01:00
7310235fa2 Add fluent-bit support to kubezero-logging, istio fixes 2020-08-22 18:27:18 +01:00
123d7ce946 Refactor argo apps factory 2020-08-21 20:39:55 +01:00
715e1d6c69 Switch istio ingress to http healthchecks, more tuning 2020-08-21 14:17:47 +01:00
f30df54c73 Also apply improved healthcheck handling and draining to public ingress 2020-08-20 18:32:01 +01:00
2f258a3194 Apply graceful shutdown fixes interim like Contour 2020-08-20 17:38:18 +01:00
be013b67ce Add graceful shutdown to Ingress gateway, might need istio 1.7 to actually work though 2020-08-20 16:55:47 +01:00
233e53c928 Set Istio idle timeout to 1h 2020-08-20 16:12:41 +01:00
96a6132a43 Set Istio idle timeout 2020-08-20 15:55:49 +01:00
f1ef778075 Disable default syncPolicy, use values instead 2020-08-20 11:40:08 +01:00
4fcf2c0ed3 Revert to default images 2020-08-18 13:13:30 +01:00
2f48198ffb Fix istio to service mapping 2020-08-18 12:45:15 +01:00
3e581471ff Revert to quay image, disable plugins 2020-08-18 12:36:56 +01:00
09886b10b2 Disable persistence by default 2020-08-18 12:08:49 +01:00
695318eada Add missing fluentd secrets 2020-08-18 11:58:37 +01:00
51921f3d47 Disable statefulset for fluentd being broken upstream 2020-08-18 11:41:09 +01:00
d36bf246d6 Add fluentd to logging 2020-08-18 11:34:34 +01:00
05da44c191 Make old ECK resources optional 2020-08-17 13:12:07 +01:00
b2e6911ca8 Make argocd metrics work 2020-08-16 19:25:07 +01:00
c09e471474 Wire up prometheus metrics for argo-cd 2020-08-16 15:49:57 +01:00
6f981eabc0 EBS-CSI version bump, reduce ArgoCD concurrency to reduce load spikes, sync from 180s to 300s 2020-08-15 23:37:45 +01:00
7be12de4e8 Add cert-manager state handling for argo 2020-08-15 15:59:57 +01:00
56ef55ef7a Exclude stateful service objects to prevent double scrapes 2020-08-15 14:49:30 +01:00
bef01e96ab Exclude stateful service objects to prevent double scrapes 2020-08-15 14:45:43 +01:00
dd8337660f Add label for servicemonitor 2020-08-15 14:33:41 +01:00
58658bbc01 More logging fixes 2020-08-15 14:25:07 +01:00
4d5a6b72d1 Fix optional prometheus support 2020-08-15 13:24:14 +01:00
943a2080b7 More logging fixes 2020-08-14 23:02:30 +01:00
bbab7de883 Adjust scrape internal for kiam to match others, and servicemonitor for agents 2020-08-14 22:31:34 +01:00
770222bcd7 Add proper label for kiam servicemonitors 2020-08-14 17:39:05 +01:00
aa9dbe455f Add elastic-system ns to kubezero 2020-08-14 17:12:06 +01:00
30d69401b1 Bugfix for prometheus service for calico 2020-08-14 17:10:25 +01:00
afe2e4a34c Bugfix release for Calico, README updates 2020-08-14 17:05:25 +01:00
64dbb4e4a6 More logging fixes... ready for first trial 2020-08-14 15:52:10 +01:00
8c1f45cae1 Various logging fixes to get a first version of ES and Kibana running 2020-08-13 19:44:50 +01:00
8880b983ac Add rabbitmq ingress gateway def 2020-08-11 15:09:48 +01:00
9359ee62c0 Add logging as default ns to look for servicemonitors 2020-08-10 13:53:41 +01:00
74599ddf1b Make sure nodeselector is a string 2020-08-10 13:28:45 +01:00
665fc68f7e make nodeselector for private ingress configurable 2020-08-10 13:20:36 +01:00
5d9f2a5226 Version bump of aws-ebs csi driver to 0.6.0 2020-08-10 12:33:53 +01:00
ede6d6513f Update all charts to use latest lib 2020-08-07 17:02:22 +01:00
706b23d547 New istio naming schema for virtualservices 2020-08-06 19:07:06 +01:00
24ebdf360f Various deps updates, Istio to 1.6.7 2020-08-06 18:43:59 +01:00
2b75664215 Extend shared library for naming functions 2020-08-06 17:21:27 +00:00
5e17b545a9 Add default labels 2020-08-06 17:15:32 +00:00
f32cca216b Add latest docs 2020-08-06 12:38:40 +01:00
2a6449a0b2 Add optional istio ingress policies to metrics 2020-08-06 11:34:32 +00:00
5a46bc784f Add custom prometheus-operator settings 2020-08-06 11:52:16 +01:00
396c16d6ad Controller and scheduler use self-signed certs 2020-08-05 15:58:37 +01:00
c5e38dcc83 Add cert-manager backup support in bootstrap, enable schedule and controller metrics 2020-08-05 15:42:15 +01:00
167c10d957 ArgoCd version bump 2020-08-05 13:29:50 +01:00
c877d9c470 Finally fix go templating awkwardness 2020-08-05 01:05:05 +01:00
c64ef24b0c Volume features need more testing <1.17 2020-08-05 01:00:11 +01:00
e4fa7d57a4 Fix Go template specialness 2020-08-05 00:56:31 +01:00
a600591b28 Enable Volumesnapshot/resize, disable leader election of single instance 2020-08-05 00:50:26 +01:00
9dab68e0d3 Disable kubelet cadvisor metrics 2020-08-04 14:45:42 +01:00
14be15423a Enable kube_proxy metrics as a trial 2020-08-04 10:08:10 +01:00
199f734f75 Add node_exporter relabel for adapter 2020-08-04 01:38:26 +01:00
5b8ea0e5cd Adapter config from kube-prometheus 2020-08-03 22:19:16 +01:00
bfbb478006 Adapter config from kube-prometheus 2020-08-03 22:15:05 +01:00
9503aa7a9b Disable default rules for the adapter 2020-08-03 21:29:24 +01:00
c0587f6fdf Temp add custom prometheus url 2020-08-03 20:56:52 +01:00
b982254fe0 Temp add custom prometheus url 2020-08-03 20:52:57 +01:00
1f6aaf308f Fix scope for adapter rules 2020-08-03 18:53:18 +01:00
56d20b0683 Try default settings for adapter 2020-08-03 18:47:11 +01:00
f9055d49fa Disable unreachable metrics for now 2020-08-03 18:30:33 +01:00
e8b0428e41 Add istio for prometheus 2020-08-03 17:44:58 +01:00
03506a40c9 First mostly working version 2020-08-03 17:15:12 +01:00
d825f9f7b8 Add all the rules minus alertmanager 2020-08-03 17:01:39 +01:00
6461fc7036 Config fixes, svc name fix 2020-08-03 16:34:57 +01:00
3b36e4939f Add istio support for metrics grafana 2020-08-03 16:24:32 +01:00
a32698e993 Add Grafana 2020-08-03 16:08:16 +01:00
1419deb729 Fix scope of prometheus options 2020-08-03 15:51:44 +01:00
94e5799ba4 Revert to default Prometheus version 2020-08-03 15:43:56 +01:00
2b325c77a3 Enable operator on release ns 2020-08-03 13:57:14 +01:00
72e831c028 Enable operator on release ns 2020-08-03 13:50:32 +01:00
1948ed7094 Add basic Prometheus itself 2020-08-03 13:26:00 +01:00
3a4d0a6a90 Enable node_exporter 2020-08-03 13:16:48 +01:00
8f31607377 Latest deploy, add prometheus adapter to metrics 2020-08-03 13:06:07 +01:00
87f1a3c8d3 Disable webhooks for now, latest operator 2020-07-31 01:32:44 +01:00
a6bb7e2425 Add webhooks and set tolerations 2020-07-31 01:18:07 +01:00
9deafa7f3e Let Argo take care of CRDs 2020-07-30 18:56:46 +01:00
e4fb576a55 Add monitoring and logging NS to kubezero argo project 2020-07-30 18:27:43 +01:00
987ad6aef0 Move metrics to monitoring NS 2020-07-30 18:26:11 +01:00
676273f7e2 Add draft metrics chart 2020-07-30 18:18:32 +01:00
8f5ba87b9a Initial metrics chart 2020-07-30 17:19:48 +01:00
62013253f8 minor bootstrap fix 2020-07-30 17:19:04 +01:00
b6775e1ef5 Convert argo-cd ACL to DENY policy 2020-07-29 18:02:18 +01:00
826d1ff187 Apparently no patch levels in requirements 2020-07-29 15:12:06 +01:00
a6cc459c46 More cleanup, kiam doc update 2020-07-29 15:07:41 +01:00
2b5103c6ee Calico cleanup, add efs-csi 2020-07-29 14:46:55 +01:00
bbc60e778f Tweaks for aws-ebs-csi-driver, added initial aws-efs-csi-driver 2020-07-24 15:40:24 +01:00
47809b452f Remove duplicate CRD 2020-07-24 12:31:22 +01:00
b75bbbfa34 Helm bugfixes 2020-07-24 12:24:21 +01:00