Commit Graph

487 Commits

Author SHA1 Message Date
Joshua Boniface 54bf70d336 Enable Prometheus metrics in Zookeeper too 2023-12-10 00:32:37 -05:00
Joshua Boniface 513313d60f Limit FRR Prom exporter to 12+
Package did not exist on Debian 10/11
2023-12-10 00:31:21 -05:00
Joshua Boniface 35b375ab0e Fix incorrect variable name 2023-12-10 00:26:10 -05:00
Joshua Boniface bf10ede298 Add additional Prometheus exporters 2023-12-10 00:24:19 -05:00
Joshua Boniface 8bf3bbdeb1 Enable Prometheus exporter on nodes by default 2023-12-10 00:18:44 -05:00
Joshua Boniface 010ecefe16 Ensure pvchealthd is restarted as well 2023-12-10 00:13:42 -05:00
Joshua Boniface c07b835e33 Fix bad variable 2023-12-10 00:07:08 -05:00
Joshua Boniface 54c9313668 Force all when updating PVC packages
Avoids the overwrite issue when upgrading from <0.9.83 to 0.9.83.
2023-12-10 00:04:02 -05:00
Joshua Boniface c488b04939 Ensure new packages are installed as well 2023-12-09 23:59:42 -05:00
Joshua Boniface 9e21aecf97 Ignore errors then check for PVC package
This helps work around apt issues when running from the oneshot
update-pvc-daemons playbook. On a new install, this will be OK. On an
upgrade, the apt tasks will fail, which is ignored, but then the check
that pvc-client-cli is installed will ensure that things are actually
sane before proceeding.
2023-12-09 23:56:34 -05:00
Joshua Boniface 865c7d0872 Add Ceph Prometheus configurations (0.9.84) 2023-12-09 23:51:29 -05:00
Joshua Boniface 18054c01a0 Remove obsolete config templates 2023-12-09 23:05:16 -05:00
Joshua Boniface 5111ae47c4 Lower default monitoring interval to 15s
Faults are also reported on the monitoring interval, so 60s seems too
long. Lower this to 15 seconds by default instead.
2023-12-01 16:06:24 -05:00
Joshua Boniface 82d2f13981 Add legacy config cleanup to playbook 2023-12-01 02:17:42 -05:00
Joshua Boniface 73ad2a7751 Avoid removal of old versions at all
We simply shouldn't do this here. Let's leave them hanging around unless
removed in another way (e.g. in update-pvc-daemons)
2023-12-01 02:08:22 -05:00
Joshua Boniface 97b1469a70 Set ownership of pvc.conf 2023-12-01 01:57:56 -05:00
Joshua Boniface d59c9ce1ea Add safety to removal of legacy configs
This conditional will ensure that, the first time pvc.conf is installed
(or, subsequent times, until it stabilizes), the legacy configs will not
be removed. Then, on the next run in which pvc.conf does not change,
they will be removed.

This should provide a safety valve during a 0.9.83 update with the
update-pvc-daemons playbook: if the update succeeds, on the next run,
the legacy configs will be purged; otherwise, they will still be present
and can be used for fallback just in case.

This probably isn't needed, but just in case I'd rather be safe.
2023-12-01 01:45:48 -05:00
Joshua Boniface 1cfda69e5e Remove autobackup.yaml and fix quoting 2023-12-01 01:43:14 -05:00
Joshua Boniface 9408bf709c Only install pvcapid on coordinators
There should be no reason for the API to be installed on non-coordinator
hosts, so separate it out.
2023-12-01 01:40:56 -05:00
Joshua Boniface 15fc3261de Add PVC role tasks to update-pvc-daemons
Ensures that configurations are always updated whenever the daemons are.
This will be necessary for 0.9.83 with the fundamental change from
pvcXd.yaml to pvc.conf configuration formats, and it also ensures that
future daemon updates include any configuration changes that may be
pending in the group_vars.
2023-12-01 01:37:39 -05:00
Joshua Boniface 9d2af41d3f Install new packages and remove old confs 2023-11-30 03:29:24 -05:00
Joshua Boniface 1e89a1440c Enable modelines by default 2023-11-28 16:13:49 -05:00
Joshua Boniface b1d6915cf4 Write new pvc.conf style configuration (0.9.82+) 2023-11-28 16:10:23 -05:00
Joshua Boniface 7dbabf76c5 Remove pycache entries on update 2023-11-25 00:51:13 -05:00
Joshua Boniface fef97f0b04 Adjust name of pvcapid-worker to pvcworkerd 2023-11-15 20:32:23 -05:00
Joshua Boniface 8ba0ca02b1 Add SSHFS auto_mount example to group_vars 2023-11-08 12:33:34 -05:00
Joshua Boniface c8764159f6 Readd queue configuration with updated options 2023-11-05 23:37:49 -05:00
Joshua Boniface 523f7da71e Add KeyDB (Redis clone) to configuration
Replaces Redis for PVC >= 0.9.81
2023-11-05 19:24:30 -05:00
Joshua Boniface 2f9603c82f Adjust pvcapid.yaml for 0.9.81 worker queue config 2023-11-04 12:52:21 -04:00
Joshua Boniface e15e2dfaab Remove erroneous netmask from floating IP defaults 2023-11-04 12:51:07 -04:00
Joshua Boniface 103e9fe147 Add restart overrides for ceph-mgr
Needed because ceph-mgr seems to crash frequently under Debian 12 when
adding or removing OSDs. The default settings do not restart it
properly, so this override does.
2023-11-03 14:25:31 -04:00
Joshua Boniface 15a5b581f1 Disable failing socket services 2023-11-03 12:10:19 -04:00
Joshua Boniface 90417621d7 Add autobackup support to pvc-ansible 2023-10-27 02:08:20 -04:00
Joshua Boniface 677287fd2e Add additional wait after stopping OSDs
Allows the Ceph cluster to properly reconcile first.
2023-10-24 10:42:15 -04:00
Joshua Boniface 17f819ea3f Don't set "latest" for libvirt packages
Avoids errors during runs before upgrades.
2023-10-24 10:41:47 -04:00
Joshua Boniface d0bcbf123f Move kernel cleanup to after reboot
Otherwise, modules and the like might fail when the kernel package is
purged before the reboot, causing odd failures.
2023-10-24 10:41:47 -04:00
Joshua Boniface 7fe682aa60 Handle freshness for all 3 types separately
If microcode was missing, the checks of the other two would be UNKN and
thus would not trigger a restart. But, if microcode *is* present, we
want to restart for either of the other two as well.

So separate into 3 distinct checks and restart if any one is changed.
2023-10-24 10:41:47 -04:00
Joshua Boniface c11f896a60 Fix zk_status check target znode 2023-10-22 00:42:43 -04:00
Joshua Boniface 5764695699 Add AMD microcode as well 2023-10-03 13:36:56 -04:00
Joshua Boniface f4bbdb7c86 Use full path for uuidgen 2023-09-29 03:00:53 -04:00
Joshua Boniface c5d572521f Ensure any errors are fatal during deploy 2023-09-21 15:18:34 -04:00
Joshua Boniface 82accb3b5e Install intel-microcode on Intel CPUs
Required; otherwise needrestart fails.
2023-09-20 16:43:08 -04:00
Joshua Boniface 6d05f40242 Fix import for newer Ansible versions 2023-09-18 09:42:01 -04:00
Joshua Boniface a6957e9a8a Add default monitoring interval to group_vars 2023-09-15 22:32:02 -04:00
Joshua Boniface 83636388f0 Add configurable monitoring interval 2023-09-15 22:31:16 -04:00
Joshua Boniface e995f3750b Fix incorrect repo name in Bullseye 2023-09-09 19:28:47 -04:00
Joshua Boniface 85253e9706 Enable pass-through IOMMU on Bookworm 2023-09-05 16:35:58 -04:00
Joshua Boniface 6ac6b74023 Update key name 2023-09-05 13:50:37 -04:00
Joshua Boniface 8a901e5326 Add master checkout during update-remote 2023-09-05 13:22:01 -04:00
Joshua Boniface 80f5a4f260 Add dpkg-cleanup step to base config 2023-09-05 10:32:40 -04:00