Commit Graph

415 Commits

Author SHA1 Message Date
Joshua Boniface 8bf3bbdeb1 Enable Prometheus exporter on nodes by default 2023-12-10 00:18:44 -05:00
Joshua Boniface 9e21aecf97 Ignore errors then check for PVC package
This helps work around apt issues when running from the oneshot
update-pvc-daemons playbook. On a new install, this will be OK. On an
upgrade, the apt tasks may fail, which is OK, because the subsequent
check for the pvc-client-cli package will ensure that things are
actually sane before proceeding.
2023-12-09 23:56:34 -05:00
Joshua Boniface 865c7d0872 Add Ceph Prometheus configurations (0.9.84) 2023-12-09 23:51:29 -05:00
Joshua Boniface 18054c01a0 Remove obsolete config templates 2023-12-09 23:05:16 -05:00
Joshua Boniface 5111ae47c4 Lower default monitoring interval to 15s
Faults are also reported on the monitoring interval, so 60s seems like
too long. Lower this to 15 seconds by default instead.
2023-12-01 16:06:24 -05:00
Joshua Boniface 73ad2a7751 Avoid removal of old versions at all
We simply shouldn't do this here. Let's leave them hanging around unless
removed in another way (e.g. in update-pvc-daemons)
2023-12-01 02:08:22 -05:00
Joshua Boniface 97b1469a70 Set ownership of pvc.conf 2023-12-01 01:57:56 -05:00
Joshua Boniface d59c9ce1ea Add safety to removal of legacy configs
This conditional will ensure that, the first time pvc.conf is installed
(or, subsequent times, until it stabilizes), the legacy configs will not
be removed. Then, on the next run in which pvc.conf does not change,
they will be removed.

This should provide a safety valve during a 0.9.83 update with the
update-pvc-daemons playbook: if the update succeeds, on the next run,
the legacy configs will be purged; otherwise, they will still be present
and can be used for fallback just in case.

This probably isn't needed, but just in case I'd rather be safe.
2023-12-01 01:45:48 -05:00
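
The conditional described in d59c9ce1ea can be sketched roughly as follows. This is a minimal Python illustration of the decision only, not the playbook's actual task; the function name and its boolean parameter are hypothetical.

```python
def should_remove_legacy_configs(pvc_conf_changed: bool) -> bool:
    """Return True only when pvc.conf was left unchanged by this run.

    Hypothetical helper: a run that installs or modifies pvc.conf keeps the
    legacy configs around as a fallback; a later run in which pvc.conf does
    not change is the one that removes them.
    """
    return not pvc_conf_changed


# First run writes pvc.conf, so the legacy configs are kept.
first_run = should_remove_legacy_configs(pvc_conf_changed=True)
# A later run finds pvc.conf stable, so the legacy configs are purged.
later_run = should_remove_legacy_configs(pvc_conf_changed=False)
print(first_run, later_run)
```

Tying removal to "no change this run" means a failed or partial update always leaves the old files in place for at least one more run.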
Joshua Boniface 1cfda69e5e Remove autobackup.yaml and fix quoting 2023-12-01 01:43:14 -05:00
Joshua Boniface 9408bf709c Only install pvcapid on coordinators
There should be no reason for the API to be installed on non-coordinator
hosts, so separate it out.
2023-12-01 01:40:56 -05:00
Joshua Boniface 9d2af41d3f Install new packages and remove old confs 2023-11-30 03:29:24 -05:00
Joshua Boniface 1e89a1440c Enable modelines by default 2023-11-28 16:13:49 -05:00
Joshua Boniface b1d6915cf4 Write new pvc.conf style configuration (0.9.82+) 2023-11-28 16:10:23 -05:00
Joshua Boniface c8764159f6 Readd queue configuration with updated options 2023-11-05 23:37:49 -05:00
Joshua Boniface 523f7da71e Add KeyDB (Redis clone) to configuration
Replaces Redis for PVC >= 0.9.81
2023-11-05 19:24:30 -05:00
Joshua Boniface 2f9603c82f Adjust pvcapid.yaml for 0.9.81 worker queue config 2023-11-04 12:52:21 -04:00
Joshua Boniface e15e2dfaab Remove erroneous netmask from floating IP defaults 2023-11-04 12:51:07 -04:00
Joshua Boniface 103e9fe147 Add restart overrides for ceph-mgr
Needed because ceph-mgr seems to crash frequently under Debian 12 when
adding or removing OSDs. The default settings do not restart it
properly, so this override does.
2023-11-03 14:25:31 -04:00
Joshua Boniface 15a5b581f1 Disable failing socket services 2023-11-03 12:10:19 -04:00
Joshua Boniface 90417621d7 Add autobackup support to pvc-ansible 2023-10-27 02:08:20 -04:00
Joshua Boniface 17f819ea3f Don't set "latest" for libvirt packages
Avoids errors during runs before upgrades.
2023-10-24 10:41:47 -04:00
Joshua Boniface c11f896a60 Fix zk_status check target znode 2023-10-22 00:42:43 -04:00
Joshua Boniface 5764695699 Add AMD microcode as well 2023-10-03 13:36:56 -04:00
Joshua Boniface f4bbdb7c86 Use full path for uuidgen 2023-09-29 03:00:53 -04:00
Joshua Boniface 82accb3b5e Install intel-microcode on Intel CPUs
Required otherwise needrestart fails.
2023-09-20 16:43:08 -04:00
Joshua Boniface 83636388f0 Add configurable monitoring interval 2023-09-15 22:31:16 -04:00
Joshua Boniface e995f3750b Fix incorrect repo name in Bullseye 2023-09-09 19:28:47 -04:00
Joshua Boniface 85253e9706 Enable pass-through IOMMU on Bookworm 2023-09-05 16:35:58 -04:00
Joshua Boniface 6ac6b74023 Update key name 2023-09-05 13:50:37 -04:00
Joshua Boniface 80f5a4f260 Add dpkg-cleanup step to base config 2023-09-05 10:32:40 -04:00
Joshua Boniface 8ebb8a8339 Disable autoscale via command
As per [1] the ceph.conf option does not work properly and must be set this way.

[1] https://stackoverflow.com/questions/63853436/ceph-octopus-setting-autoscale-mode-from-ceph-conf-file
2023-09-02 01:59:47 -04:00
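
As an illustration of setting the autoscale mode by command rather than in ceph.conf, the sketch below builds the argument list for the standard per-pool Ceph command. The log does not show which exact command the commit runs, so treat this as an assumption; the helper name and the pool name `vms` are hypothetical, and nothing is executed here.

```python
from typing import List


def autoscale_off_command(pool: str) -> List[str]:
    """Build the argv for disabling the PG autoscaler on one pool.

    Assumption: the per-pool form `ceph osd pool set <pool>
    pg_autoscale_mode off` is used; the commit itself may use a
    different invocation.
    """
    if not pool:
        raise ValueError("pool name must not be empty")
    return ["ceph", "osd", "pool", "set", pool, "pg_autoscale_mode", "off"]


# "vms" is a placeholder pool name for illustration only.
print(" ".join(autoscale_off_command("vms")))
```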
Joshua Boniface a10b3e8d4a Lower default pgs and disable autoscale 2023-09-01 23:54:10 -04:00
Joshua Boniface cf426408f2 Restore original setting 2023-09-01 16:18:20 -04:00
Joshua Boniface 3680717daa Remove extra restarts on bootstrap 2023-09-01 15:42:30 -04:00
Joshua Boniface 1f4cd92d63 Fix bad calls to node primary 2023-09-01 15:42:30 -04:00
Joshua Boniface 6da9956811 Fix delegate_to 2023-09-01 15:42:30 -04:00
Joshua Boniface fb60093750 Ignore errors in Patroni restart handler 2023-09-01 15:42:30 -04:00
Joshua Boniface 7b061966ad Ignore errors in Patroni
Required during upgrades as the service may be masked.
2023-09-01 15:42:30 -04:00
Joshua Boniface 1e497413e8 Remove extra whitespace 2023-09-01 15:42:30 -04:00
Joshua Boniface 64ce09122d Add additional primary node switch 2023-09-01 15:42:30 -04:00
Joshua Boniface 353399a407 Ensure core pg_hba entries are present 2023-09-01 15:42:30 -04:00
Joshua Boniface e754ca84f6 Add one more fact regathering 2023-09-01 15:42:30 -04:00
Joshua Boniface cb2cbdff61 Add zstd dependency for D10+ 2023-09-01 15:42:30 -04:00
Joshua Boniface b21778f117 Fix Patroni upgrade and D12 support 2023-09-01 15:42:30 -04:00
Joshua Boniface 9411679004 Fix reboot 2023-09-01 15:42:30 -04:00
Joshua Boniface 0de0ec7ded Ensure facts are always regathered 2023-09-01 15:42:30 -04:00
Joshua Boniface 7c8b6919fe Add Debian 12 Patroni config 2023-09-01 15:42:30 -04:00
Joshua Boniface 7fc57a69b2 Fix warning in user module 2023-09-01 15:42:30 -04:00
Joshua Boniface 2ba8f1cfc3 Add retries to all apt commands 2023-09-01 15:42:30 -04:00
Joshua Boniface d54844746e Ignore errors enabling vhostmd
Seems to cause issues in bookworm.
2023-09-01 15:42:30 -04:00
Joshua Boniface 1c2bd544b3 Use non-free-firmware repository 2023-09-01 15:42:30 -04:00
Joshua Boniface 71d956dab7 Add final pvcnoded restart 2023-09-01 15:42:30 -04:00
Joshua Boniface 7e09ee7d21 Allow specifying interface mode 2023-09-01 15:42:30 -04:00
Joshua Boniface ed2fe7106e Fix support for bookworm 2023-09-01 15:42:30 -04:00
Joshua Boniface 4bcd7b40a0 Remove extra echo with PVC 0.9.64 2023-09-01 15:42:30 -04:00
Joshua Boniface f79d1da5be Update other commands to use new CLI format 2023-09-01 15:42:30 -04:00
Joshua Boniface 0d3e525f12 Update link to one level higher 2023-09-01 15:42:29 -04:00
Joshua Boniface 94b12794dc Work around SSH key bug 2023-09-01 15:42:29 -04:00
Joshua Boniface 017e1405ed Use debian_version custom fact 2023-09-01 15:42:29 -04:00
Joshua Boniface 08f923d29c Use custom fact for Debian codename 2023-09-01 15:42:29 -04:00
Joshua Boniface 679e15c484 Add *.update-* obsolete configs to dpkg plugin 2023-09-01 15:42:29 -04:00
Joshua Boniface a490924e3a Add traceroute and MTR to PVC package list 2023-09-01 15:42:29 -04:00
Joshua Boniface f8ef2602bc Revert "Fix symlink to be one level up"
This reverts commit 7693b2d78f.
2023-09-01 15:42:29 -04:00
Joshua Boniface dcaa0228b7 Fix symlink to be one level up 2023-09-01 15:42:29 -04:00
Joshua Boniface efeaa61e0f Add customizable NTP servers 2023-09-01 15:42:29 -04:00
Joshua Boniface e9f76042bd Allow specifying alternate channels in IPMI 2023-09-01 15:42:29 -04:00
Joshua Boniface cab4deac26 Add configuration field for plugins 2023-09-01 15:42:29 -04:00
Joshua Boniface 34d12ab423 Add Ceph check 2023-09-01 15:42:29 -04:00
Joshua Boniface c2b576334f Adjust plugin log config field for 0.9.62 2023-09-01 15:42:29 -04:00
Joshua Boniface 84a3f7afa0 Add edac-utils to packages 2023-09-01 15:42:29 -04:00
Joshua Boniface 612045b8b3 Restore original rsyslog-rotate script
Direct call doesn't work because of how arguments are passed to
postrotate.
2023-09-01 15:42:29 -04:00
Joshua Boniface 5cd9566163 Explicitly use systemctl in logrotate
For some reason (Debian bug?) the default rsyslog-rotate script was not
properly rotating rsyslog logfiles. Instead, explicitly call systemctl
kill -s HUP for this, using a full path.
2023-09-01 15:42:29 -04:00
Joshua Boniface 57010260bd Use full debian_version 2023-09-01 15:42:29 -04:00
Joshua Boniface 2a925904e4 Alter format of Debian version in MOTD 2023-09-01 15:42:29 -04:00
Joshua Boniface 561ecb5c61 Adjust name of bootstrap trigger variable
The PVC bootstrap framework overrides this variable and wreaks havoc on
it. Instead adjust our side so that it looks for do_bootstrap instead.
2023-09-01 15:42:29 -04:00
Joshua Boniface a79961605a Replace per-user htoprc with system-wide config
Also update to newer htoprc layout from BLSE.
2023-09-01 15:42:29 -04:00
Joshua Boniface 5a48ec4d79 Ensure CPU tuning is only applied on Debian 11+ 2023-09-01 15:42:29 -04:00
Joshua Boniface 07d75573d6 Add updated tuning configuration
Uses a much nicer CPU tuning configuration, leveraging systemd's
AllowedCPUs and CPUAffinity options within a set of slices (some
default, some custom).

Configuration is also greatly simplified versus the previous
implementation, simply asking for a number of CPUs for both the system
and OSDs, and calculating everything else that is required.

Also switches (back) to the v2 unified cgroup hierarchy by default as
required by the systemd AllowedCPUs directive.
2023-09-01 15:42:29 -04:00
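
To give a feel for "asking for a number of CPUs and calculating everything else", here is a small Python sketch that splits a host's CPUs into three sets formatted as ranges of the kind systemd's AllowedCPUs accepts. The layout (system CPUs first, then OSD CPUs, then the remainder) and all names are assumptions for illustration; the log does not describe how the role actually assigns CPUs.

```python
def cpu_range(start: int, count: int) -> str:
    """Format `count` CPUs starting at index `start` as a CPU list string."""
    if count < 1:
        raise ValueError("count must be at least 1")
    end = start + count - 1
    return str(start) if count == 1 else f"{start}-{end}"


def plan_cpu_sets(total_cpus: int, system_cpus: int, osd_cpus: int) -> dict:
    """Split CPUs into system, OSD, and leftover sets (assumed layout)."""
    remaining = total_cpus - system_cpus - osd_cpus
    if remaining < 1:
        raise ValueError("no CPUs left over after the system and OSD sets")
    return {
        "system": cpu_range(0, system_cpus),
        "osd": cpu_range(system_cpus, osd_cpus),
        "other": cpu_range(system_cpus + osd_cpus, remaining),
    }


# Example: a 16-CPU host with 2 CPUs for the system and 4 for OSDs.
print(plan_cpu_sets(16, 2, 4))
```

Only the two counts are supplied by the operator in this sketch; every range is derived from them and from the total CPU count.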
Joshua Boniface fa4f1cff0f Adjust variable used for migrate selector 2023-09-01 15:42:29 -04:00
Joshua Boniface 1d35fec8a8 Remove cpuset configurations
This functionality simply did not work, with Libvirt continuing to dump
its processes into the root cset thus defeating the purpose entirely.

Just remove it; from some very initial testing, it isn't worth the
headache.
2023-09-01 15:42:29 -04:00
Joshua Boniface f51fc2ce64 Fix setting of csets for OSDs 2023-09-01 15:42:29 -04:00
Joshua Boniface 8f685116b7 Add Ceph monitor backup 2023-09-01 15:42:29 -04:00
Joshua Boniface c3ce11dacf Fix update-motd so it runs properly 2023-09-01 15:42:29 -04:00
Joshua Boniface 267494d58a Add lm-sensors configuration 2023-09-01 15:42:29 -04:00
Joshua Boniface d94f587e37 Remove obsolete logrotate settings 2023-09-01 15:42:29 -04:00
Joshua Boniface 73e1f2042c Add extra space for clarity 2023-09-01 15:42:29 -04:00
Joshua Boniface 30ddeb0fee Update MOTD automatically on boot
The cron every minute was pointlessly excessive considering this doesn't
actually change minute-to-minute.
2023-09-01 15:42:29 -04:00
Joshua Boniface 86026de8ef Adjust colour scheme of MOTD 2023-09-01 15:42:29 -04:00
Joshua Boniface 8e1d005d43 Obtain more information for MOTD header
Add model and serial numbers to the vendor, and put this on its own
line. Also use BASH for proper syntax formatting. Reformat the header to
be a more compact format.
2023-09-01 15:42:29 -04:00
Joshua Boniface b987c4ea8f Adjust GRUB_DIST and add UEFI regeneration
Keeps the UEFI boot list cleaned and consistent
2023-09-01 15:42:29 -04:00
Joshua Boniface 144f519e76 Add rinse dependency for provisioner 2023-09-01 15:42:29 -04:00
Joshua Boniface be091f66d4 Remove pvc-flush references
This service usually causes more problems than it solves, so it is being
removed in the next PVC version.
2023-09-01 15:42:28 -04:00
Joshua Boniface 08c8be66b3 Increase timeout threshold for freshness 2023-09-01 15:42:28 -04:00
Joshua Boniface 00482aec06 Fix the other instance too 2023-09-01 15:42:28 -04:00
Joshua Boniface da98a4d445 Ignore errors about removing keys 2023-09-01 15:42:28 -04:00
Joshua Boniface 6cf8948107 Add Ceph support for single-node clusters
Ensures that the pool default size/min size is set to something
reasonable for a single node (effective RAID-1) and replaces the default
CRUSH replicate_rule set for this situation with one choosing OSD
instead of host as the default.
2023-09-01 15:42:28 -04:00
Joshua Boniface e4ccafee73 Add cgroup delegation override
Required to solve the occasional
  libvirt: QEMU Driver error : Requested operation is not valid:
  cgroup CPUACCT controller is not mounted
problem, as per:
  https://answers.launchpad.net/ubuntu/+question/665132
2023-09-01 15:42:28 -04:00
Joshua Boniface e8fe165e00 Further optimize ownership agent output 2023-09-01 15:42:28 -04:00
Joshua Boniface cbea6e284c Make ownership check consistent with cmk-agent 2.1
The new CheckMK agent uses UID 998 (dynamic) for itself. This causes
ownership problems with the old logic of this check. Move instead to a
range, where the UIDs from 200-599 are reserved for administrators, and
check for this range explicitly. Also eliminates the exceptions for ceph
and 2000 from previous iterations.
2023-09-01 15:42:28 -04:00
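
The range test described in cbea6e284c might look something like the sketch below. Only the 200-599 bounds come from the commit message; the constant and function names are hypothetical, and how the check acts on the result is not stated in this log.

```python
# Bounds taken from the commit message: UIDs 200-599 are reserved for
# administrators. The names below are illustrative only.
ADMIN_UID_MIN = 200
ADMIN_UID_MAX = 599


def is_admin_uid(uid: int) -> bool:
    """Return True if `uid` falls inside the reserved administrator range."""
    return ADMIN_UID_MIN <= uid <= ADMIN_UID_MAX


# UID 998 (the dynamic CheckMK agent UID mentioned above) is outside the range.
for uid in (200, 599, 998, 2000):
    print(uid, is_admin_uid(uid))
```

Testing against an explicit range avoids having to list individual exceptions such as the ceph user or UID 2000.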
Joshua Boniface 9e20e47903 Update freshness checks 2023-09-01 15:42:28 -04:00