420 Commits

Author SHA1 Message Date
Stefan Reimer 1ac2eddcea Add alertmanager istio config for metrics, metrics values reorg 2020-12-02 03:53:19 -08:00
Stefan Reimer 8b048dd390 More fixes and upgrade docs 2020-12-01 07:46:04 -08:00
Stefan Reimer 3497392c39 ArgoCd naming fixes 2020-11-30 09:30:06 -08:00
Stefan Reimer a23282bdf5 More fixes 2020-11-30 04:13:52 -08:00
Stefan Reimer 85e89f768c cert-manager version bump, local-path-provisioner fixes 2020-11-30 11:34:44 +00:00
Stefan Reimer 4bca9bd869 Add local-path-provisioner, re-org bootstrap 2020-11-30 01:52:11 -08:00
Stefan Reimer 91c59e3560 Metrics update 2020-11-28 23:54:40 +00:00
Stefan Reimer bc7f4b08ed More bugfixes, ingress certs 2020-11-28 15:01:20 -08:00
Stefan Reimer 7e1d26aa5c More fixes 2020-11-27 08:19:44 -08:00
Stefan Reimer 7df88c8883 Add missing .helmignore 2020-11-26 15:31:40 -08:00
Stefan Reimer ca2d2763d3 Latest fixes, fluent-bit version bump 2020-11-26 09:37:10 -08:00
Stefan Reimer 74e07acf13 More fixes now adding ArgoCD 2020-11-26 05:21:10 -08:00
Stefan Reimer 8b4a2bd920 Another argo tweak 2020-11-24 07:29:38 -08:00
Stefan Reimer 486ea0fa56 Bug fixes and argo tweaks 2020-11-24 07:18:14 -08:00
Stefan Reimer 0a1cb7a07a Revert Kube version check to make argo work 2020-11-24 06:51:48 -08:00
Stefan Reimer 33bf724618 First try adding argoCD day 2 2020-11-24 06:44:57 -08:00
Stefan Reimer f711655c58 Update of various components, new argoless bootstrap working 2020-11-21 04:24:57 -08:00
Stefan Reimer 9d0e2f00a9 First steps of argoless bootstrap 2020-11-03 12:51:57 +00:00
Stefan Reimer 073916903c Minor version bump for prometheus-stack, remove default CPU limit 2020-10-27 14:13:52 +00:00
Stefan Reimer 7c945fbac7 Update docs, bump argo-cd parallel jobs 2020-10-27 11:54:44 +00:00
Stefan Reimer 10e59e67e1 Remove argocd from control plane 2020-10-21 14:18:02 +01:00
Stefan Reimer 6ca8df71ab Enable json logs for argo-cd finally 2020-10-21 13:29:49 +01:00
Stefan Reimer cf00ff3fd7 Bump argo-cd chart version 2020-10-21 13:14:23 +01:00
Stefan Reimer 853edcb141 Bump argo-cd version 2020-10-21 13:12:23 +01:00
Stefan Reimer 31bcd30c41 Revert more prometheus-adapter config 2020-10-21 13:05:08 +01:00
Stefan Reimer c486223699 Revert prometheus adapter changes 2020-10-21 12:51:15 +01:00
Stefan Reimer 21a9816dea More EFS fixes, cert-manager version bump 2020-10-21 04:37:33 -07:00
Stefan Reimer 46cf90068f Adjust prometheus URLs 2020-10-09 18:41:43 -07:00
Stefan Reimer 7dbc97bcbc First stab at new prometheus charts 2020-10-09 17:58:44 -07:00
Stefan Reimer e5cb34c6af Cleanup 2020-10-09 12:38:20 -07:00
Stefan Reimer a903369121 Minor tweak to aws efs update tooling 2020-10-09 11:15:19 -07:00
Stefan Reimer d925bfb3d5 Actually update the default version of aws ebs to 0.7.0 2020-10-09 11:14:51 -07:00
Stefan Reimer 98f592cb99 AWS EBS driver version bump 2020-10-09 10:53:32 -07:00
Stefan Reimer ee5678b0eb Revert minimal kube version due to issues with argocd 2020-10-09 07:43:05 -07:00
Stefan Reimer 8ae67df9d2 Add multi PV support to EFS 2020-10-09 07:30:25 -07:00
Stefan Reimer c2a6452f27 Update EFS tooling to track releases 2020-10-08 07:52:34 -07:00
Stefan Reimer 6975e79fec Typo 2020-10-07 09:11:22 -07:00
Stefan Reimer 9b4d49575b New Lua function to nest entries into kube.<namespace>.* 2020-10-07 09:09:24 -07:00
Stefan Reimer 694d1b79ca fluent-bit tag improvements 2020-10-05 17:27:58 -07:00
Stefan Reimer e32d258986 Add some spaces 2020-10-05 09:03:47 -07:00
Stefan Reimer bddafc142e More logging fixes, try to decode json at the source 2020-10-05 09:01:50 -07:00
Stefan Reimer 9252d3005a Disable json logging, crashed Argo 2020-10-05 08:43:18 -07:00
Stefan Reimer ca18407f62 Revert ArgoCd 1.7.7 2020-10-05 08:27:37 -07:00
Stefan Reimer 2896c03e2a Latest argocd 2020-10-05 04:31:00 -07:00
Stefan Reimer 19769f97d4 Derp 2020-10-05 04:09:03 -07:00
Stefan Reimer 1429694e43 Updated helm-docs, fluentd SSL handled by Istio, ES&Istio tuning 2020-10-05 03:50:23 -07:00
Stefan Reimer 3f054a96ec Disable borken json parsing for now 2020-10-02 14:46:07 -07:00
Stefan Reimer a2ba9fa085 Disable borken json parsing for now 2020-10-02 14:41:40 -07:00
Stefan Reimer f4068455de Fix the warning due to double CRDs 2020-10-02 10:44:15 -07:00
Stefan Reimer 5904fcedf7 Istio version bump, make http10 support optional, enable redis,mysql protocol support 2020-10-02 10:38:09 -07:00
Stefan Reimer 2ee31f60e2 Minor fluent-bit tuning 2020-10-01 12:32:21 -07:00
Stefan Reimer c6ae3d2461 Fluentd tuning 2020-10-01 10:14:04 -07:00
Stefan Reimer 21c6b0ea58 Fluentd tuning 2020-10-01 10:11:48 -07:00
Stefan Reimer 6003765dc9 Disable pipeline still cpu issues 2020-09-28 04:54:47 -07:00
Stefan Reimer 0b50dbcfbe Reenable fluentd ingest pipeline again 2020-09-28 04:45:39 -07:00
Stefan Reimer a5952f850d Make the kiam annotate namespace job optional 2020-09-18 16:18:59 +01:00
Stefan Reimer 4a918f6d83 Logging fixes for NOT using nameoverride 2020-09-18 16:12:52 +01:00
Stefan Reimer f753a1fc71 Slightly allow ArgoCD a bit more processing 2020-09-18 14:21:39 +01:00
Stefan Reimer 85837c1666 Bump argocd to 1.7.5 as 1.7.4 has a deadlock CPU issue 2020-09-18 13:09:18 +01:00
Stefan Reimer b4c2195eef Add EnvoyFilter to enable tcp keepalive for all Ingress Envoys 2020-09-17 22:25:09 +01:00
Stefan Reimer 182ae141a0 Revert TCP keepalive for fluentd listener 2020-09-17 19:44:34 +01:00
Stefan Reimer dd9e465ead Enable TCP keepalive for fluentd listener 2020-09-17 19:24:24 +01:00
Stefan Reimer 47455bf4f0 TCP keepalive tuning for Istio 2020-09-17 17:54:57 +01:00
Stefan Reimer d3c8c92f9a Set global meshpolicy to prevent upgrade to http2 by default 2020-09-16 16:50:48 +01:00
Stefan Reimer ddb51294c9 Another argocd resource tweak 2020-09-15 11:48:07 +01:00
Stefan Reimer 16bc828a0d Introduce resources for at least the argocd controller 2020-09-15 11:15:55 +01:00
Stefan Reimer 900863acae Docs update 2020-09-14 17:26:39 +01:00
Stefan Reimer ce5290591f fluent-bit version bump 2020-09-14 17:26:19 +01:00
Stefan Reimer 09d29f2704 New bootstrap flow 2020-09-14 16:06:53 +01:00
Stefan Reimer 3a97bbed31 Latest deploy bootstrap tweaks 2020-09-14 15:24:40 +01:00
Stefan Reimer c347c56764 Disable default poddisruptionbudgets, replace with individual todo 2020-09-11 18:21:00 +01:00
Stefan Reimer 4a405a0cbc Still double CRDs 2020-09-11 16:03:22 +01:00
Stefan Reimer fb5229613d Istio is really picky 2020-09-11 16:01:15 +01:00
Stefan Reimer f7ba0ffa33 Move ports > 1024 as we run non-root 2020-09-11 15:45:04 +01:00
Stefan Reimer 530934e603 Set JSON for access logs 2020-09-11 15:39:47 +01:00
Stefan Reimer aa664bec01 Remove xp settings 2020-09-11 15:32:14 +01:00
Stefan Reimer 7a93b34331 Re-enable JSON access logs 2020-09-11 15:28:51 +01:00
Stefan Reimer 33339dbe21 Re-enable access logs 2020-09-11 15:22:34 +01:00
Stefan Reimer 05d9e25f8d Remove deprecated fields for 1.7 2020-09-11 15:20:51 +01:00
Stefan Reimer 6f60ec1dd9 Remove deprecated fields for 1.7 2020-09-11 15:18:30 +01:00
Stefan Reimer e9c0d35695 Remove deprecated fields for 1.7 2020-09-11 15:15:53 +01:00
Stefan Reimer 203f236e23 Version bump Istio to 1.7.1 2020-09-11 15:06:38 +01:00
Stefan Reimer eba052f2f6 Remove double CRD for Istio 2020-09-11 14:42:25 +01:00
Stefan Reimer a09327f3f0 more istio cleanup 2020-09-11 12:37:22 +01:00
Stefan Reimer 5c64544dcb more istio cleanup 2020-09-11 12:32:46 +01:00
Stefan Reimer 72a2a40e81 more istio cleanup 2020-09-11 12:23:08 +01:00
Stefan Reimer 2f7693388e Minor istio tweaks 2020-09-11 12:08:58 +01:00
Stefan Reimer d13fc9d519 Fix math in resources calc 2020-09-11 11:07:49 +01:00
Stefan Reimer e56d0661d6 Make ES heap configurable, set resources accordingly 2020-09-11 11:00:51 +01:00
Stefan Reimer 4cea722fd4 Istio version bump to 1.6.9 2020-09-10 16:44:49 +01:00
Stefan Reimer 790badc1cc Add resources to Kiam 2020-09-10 14:22:47 +01:00
Stefan Reimer f99cb5b21b Another prometheus resources tweak to prevent being killed during restarts 2020-09-10 14:09:23 +01:00
Stefan Reimer db5e587070 Adjust and limit Prometheus resources 2020-09-10 14:01:28 +01:00
Stefan Reimer 71de050f9e ArgoCD version bump to 1.7.4 2020-09-10 13:44:48 +01:00
Stefan Reimer c9b830f727 Change log tag for audit log to not collide with regular tags 2020-09-09 20:59:03 +01:00
Stefan Reimer 122cf5bd52 Calico version bump to 3.16.1 2020-09-09 14:17:02 +01:00
Stefan Reimer 9e043a6241 Dont remove other fields for valid json 2020-09-08 15:41:20 +01:00
Stefan Reimer da503ab38c Fix fluentd parsing of json 2020-09-08 15:34:16 +01:00
Stefan Reimer a1af1a2753 Fix fluentd typo 2020-09-08 15:07:17 +01:00
Stefan Reimer 3b438711dc Update fluentd to latest quay.io image, add json parser for message 2020-09-08 15:05:31 +01:00
Stefan Reimer b7feeae83c Remove CRD property to fix OutofSync Argo 2020-09-08 13:44:31 +01:00
Stefan Reimer 9e0e819fd6 Handle empty message events 2020-09-08 13:40:09 +01:00
Stefan Reimer e09935a819 Add Lua functions to reassemble partial cri-o logs 2020-09-08 13:12:21 +01:00
Stefan Reimer 6b1b02a743 Fluent-bit version bump and support for api audit logs 2020-09-08 12:40:28 +01:00
Stefan Reimer 63537919a4 Move scrape username to its own secret as eck operator cleans up otherwise 2020-09-04 01:13:39 +01:00
Stefan Reimer 3fb65140af Enabled scraping etcd 2020-09-02 15:05:57 +01:00
Stefan Reimer 42b792bb4b More fluentd tuning 2020-08-27 01:13:34 +01:00
Stefan Reimer e2d560c881 Disable ingest pipeline until we know what breaks / jams in ES 2020-08-27 01:03:35 +01:00
Stefan Reimer 7f540d57db Revert ES fixes as servicemonitor is retarted 2020-08-26 23:02:47 +01:00
Stefan Reimer ea3432445e Hardcode es user for now 2020-08-26 22:50:51 +01:00
Stefan Reimer f9821762f7 fluentd / ES fixes 2020-08-26 18:13:21 +01:00
Stefan Reimer c78e9c04ce Fix default value 2020-08-25 14:46:22 +01:00
Stefan Reimer 74abf0fbb3 Make Istio Ingress hosts specific matching the cert 2020-08-25 14:45:56 +01:00
Stefan Reimer 31aa92a971 Revert default fluentd image, latest has issues 2020-08-24 11:38:47 +01:00
Stefan Reimer e4c478ed19 Increase default read-timeout for fluentd 2020-08-23 17:47:28 +01:00
Stefan Reimer 8b5d9ad785 Use quay.io fluentd-es image until we roll our own 2020-08-23 17:41:37 +01:00
Stefan Reimer 80867bd1c2 Fix default fluentd hostname 2020-08-23 15:50:14 +01:00
Stefan Reimer 2d58d73798 Remove Cri parser as it is already incl. upstream now 2020-08-22 19:24:58 +01:00
Stefan Reimer 93edcec5a2 Update docs 2020-08-22 18:27:31 +01:00
Stefan Reimer be346b592f Add fluent-bit support to kubezero-logging, istio fixes 2020-08-22 18:27:18 +01:00
Stefan Reimer 47fa523694 Refactor argo apps factory 2020-08-21 20:39:55 +01:00
Stefan Reimer 3cfa3512e6 Switch istio ingress to http healthchecks, more tuning 2020-08-21 14:17:47 +01:00
Stefan Reimer 5dac264e17 Also apply improved healthcheck handling and draining to public ingress 2020-08-20 18:32:01 +01:00
Stefan Reimer bdc9687bc3 Apply graceful shutdown fixes interim like Contour 2020-08-20 17:38:18 +01:00
Stefan Reimer 89d765dc53 Add graceful shutdown to Ingress gateway, might need istio 1.7 to actually work though 2020-08-20 16:55:47 +01:00
Stefan Reimer 225526869e Set Istio idle timeout to 1h 2020-08-20 16:12:41 +01:00
Stefan Reimer c5e0187475 Set Istio idle timeout 2020-08-20 15:55:49 +01:00
Stefan Reimer d49ff51379 Disable default syncPolicy, use values instead 2020-08-20 11:40:08 +01:00
Stefan Reimer e782303703 Revert to default images 2020-08-18 13:13:30 +01:00
Stefan Reimer fbc203a2c9 Fix istio to service mapping 2020-08-18 12:45:15 +01:00
Stefan Reimer 279dde5ee2 Revert to quay image, disable plugins 2020-08-18 12:36:56 +01:00
Stefan Reimer 4a6cbfbbcf Disable persistence by default 2020-08-18 12:08:49 +01:00
Stefan Reimer e6e0aa103b Add missing fluentd secrets 2020-08-18 11:58:37 +01:00
Stefan Reimer 777a0d7f94 Disable statefulset for fluentd being broken upstream 2020-08-18 11:41:09 +01:00
Stefan Reimer 12abcacdd9 Add fluentd to logging 2020-08-18 11:34:34 +01:00
Stefan Reimer fba3e8bfc4 Make old ECK resources optional 2020-08-17 13:12:07 +01:00
Stefan Reimer 1398484af8 Make argocd metrics work 2020-08-16 19:25:07 +01:00
Stefan Reimer 0db65bd060 Wire up prometheus metrics for argo-cd 2020-08-16 15:49:57 +01:00
Stefan Reimer e8afc6ddbb EBS-CSI version bump, reduce ArgoCD concurrency to reduce load spikes, sync from 180s to 300s 2020-08-15 23:37:45 +01:00
Stefan Reimer 4b734dc1bc Add cert-manager state handling for argo 2020-08-15 15:59:57 +01:00
Stefan Reimer 03bab16aa8 Exclude stateful service objects to prevent double scrapes 2020-08-15 14:49:30 +01:00
Stefan Reimer aac2e235f8 Exclude stateful service objects to prevent double scrapes 2020-08-15 14:45:43 +01:00
Stefan Reimer a6eab7d24b Add label for servicemonitor 2020-08-15 14:33:41 +01:00
Stefan Reimer 1ae1aac294 More logging fixes 2020-08-15 14:25:07 +01:00
Stefan Reimer 5595fff159 Fix optional prometheus support 2020-08-15 13:24:14 +01:00
Stefan Reimer 0e2e8502ed More logging fixes 2020-08-14 23:02:30 +01:00
Stefan Reimer 15605d0cef Adjust scrape interval for kiam to match others, and servicemonitor for agents 2020-08-14 22:31:34 +01:00
Stefan Reimer 55b0f02394 Add proper label for kiam servicemonitors 2020-08-14 17:39:05 +01:00
Stefan Reimer a9cdc7109e Add elastic-system ns to kubezero 2020-08-14 17:12:06 +01:00
Stefan Reimer 30f6432e59 Bugfix for prometheus service for calico 2020-08-14 17:10:25 +01:00