Commit Graph

348 Commits

Author SHA1 Message Date
Joshua Boniface 34d12ab423 Add Ceph check 2023-09-01 15:42:29 -04:00
Joshua Boniface c2b576334f Adjust plugin log config field for 0.9.62 2023-09-01 15:42:29 -04:00
Joshua Boniface 84a3f7afa0 Add edac-utils to packages 2023-09-01 15:42:29 -04:00
Joshua Boniface 612045b8b3 Restore original rsyslog-rotate script
Direct call doesn't work because of how arguments are passed to
postrotate.
2023-09-01 15:42:29 -04:00
Joshua Boniface 5cd9566163 Explicitly use systemctl in logrotate
For some reason (Debian bug?) the default rsyslog-rotate script was not
properly rotating rsyslog logfiles. Instead, explicitly call systemctl
kill -s HUP for this, using a full path.
2023-09-01 15:42:29 -04:00
Joshua Boniface 57010260bd Use full debian_version 2023-09-01 15:42:29 -04:00
Joshua Boniface 2a925904e4 Alter format of Debian version in MOTD 2023-09-01 15:42:29 -04:00
Joshua Boniface 561ecb5c61 Adjust name of bootstrap trigger variable
The PVC bootstrap framework overrides this variable and wreaks havoc on
it. Instead adjust our side so that it looks for do_bootstrap instead.
2023-09-01 15:42:29 -04:00
Joshua Boniface a79961605a Replace per-user htoprc with system-wide config
Also update to newer htoprc layout from BLSE.
2023-09-01 15:42:29 -04:00
Joshua Boniface 5a48ec4d79 Ensure CPU tuning is only applied on Debian 11+ 2023-09-01 15:42:29 -04:00
Joshua Boniface 07d75573d6 Add updated tuning configuration
Uses a much nicer CPU tuning configuration, leveraging systemd's
AllowedCPUs and CPUAffinity options within a set of slices (some
default, some custom).

Configuration is also greatly simplified versus the previous
implementation, simply asking for a number of CPUs for both the system
and OSDs, and calculating everything else that is required.

Also switches (back) to the v2 unified cgroup hierarchy by default as
required by the systemd AllowedCPUs directive.
2023-09-01 15:42:29 -04:00
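The calculation described in 07d75573d6 can be sketched roughly as follows. This is a minimal illustration, not the role's actual logic: the function names, the slice labels, and the assumption that CPUs are handed out as contiguous ID ranges starting from 0 are all hypothetical. The output strings use the range syntax that systemd's AllowedCPUs directive accepts.

```python
def cpu_range(start: int, count: int) -> str:
    """Format `count` consecutive CPU IDs starting at `start` as a range string."""
    if count <= 0:
        return ""
    if count == 1:
        return str(start)
    return f"{start}-{start + count - 1}"


def plan_cpu_slices(total_cpus: int, system_cpus: int, osd_cpus: int) -> dict:
    """Derive per-slice CPU lists from two inputs: system and OSD CPU counts.

    Hypothetical layout: system CPUs first, OSD CPUs next, and every
    remaining CPU left for other workloads (e.g. VMs).
    """
    if system_cpus < 1 or osd_cpus < 0:
        raise ValueError("system_cpus must be >= 1 and osd_cpus >= 0")
    if system_cpus + osd_cpus >= total_cpus:
        raise ValueError("no CPUs would remain for other workloads")

    remaining = total_cpus - system_cpus - osd_cpus
    return {
        "system": cpu_range(0, system_cpus),
        "osd": cpu_range(system_cpus, osd_cpus),
        "other": cpu_range(system_cpus + osd_cpus, remaining),
    }
```

For example, plan_cpu_slices(16, 2, 4) yields "0-1" for the system, "2-5" for OSDs, and "6-15" for everything else. The sketch ignores hyper-threading siblings and NUMA layout, which a real configuration may need to account for.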
Joshua Boniface fa4f1cff0f Adjust variable used for migrate selector 2023-09-01 15:42:29 -04:00
Joshua Boniface 1d35fec8a8 Remove cpuset configurations
This functionality simply did not work, with Libvirt continuing to dump
its processes into the root cset thus defeating the purpose entirely.

Just remove it; from some very initial testing, it isn't worth the
headache.
2023-09-01 15:42:29 -04:00
Joshua Boniface f51fc2ce64 Fix setting of csets for OSDs 2023-09-01 15:42:29 -04:00
Joshua Boniface 8f685116b7 Add Ceph monitor backup 2023-09-01 15:42:29 -04:00
Joshua Boniface c3ce11dacf Fix update-motd so it runs properly 2023-09-01 15:42:29 -04:00
Joshua Boniface 267494d58a Add lm-sensors configuration 2023-09-01 15:42:29 -04:00
Joshua Boniface d94f587e37 Remove obsolete logrotate settings 2023-09-01 15:42:29 -04:00
Joshua Boniface 73e1f2042c Add extra space for clarity 2023-09-01 15:42:29 -04:00
Joshua Boniface 30ddeb0fee Update MOTD automatically on boot
The cron every minute was pointlessly excessive considering this doesn't
actually change minute-to-minute.
2023-09-01 15:42:29 -04:00
Joshua Boniface 86026de8ef Adjust colour scheme of MOTD 2023-09-01 15:42:29 -04:00
Joshua Boniface 8e1d005d43 Obtain more information for MOTD header
Add model and serial numbers to the vendor, and put this on its own
line. Also use BASH for proper syntax formatting. Reformat the header to
be a more compact format.
2023-09-01 15:42:29 -04:00
Joshua Boniface b987c4ea8f Adjust GRUB_DIST and add UEFI regeneration
Keeps the UEFI boot list cleaned and consistent
2023-09-01 15:42:29 -04:00
Joshua Boniface 144f519e76 Add rinse dependency for provisioner 2023-09-01 15:42:29 -04:00
Joshua Boniface be091f66d4 Remove pvc-flush references
This service causes more problems than it solves usually, so it is being
removed in the next PVC version.
2023-09-01 15:42:28 -04:00
Joshua Boniface 08c8be66b3 Increase timeout threshold for freshness 2023-09-01 15:42:28 -04:00
Joshua Boniface 00482aec06 Fix the other instance too 2023-09-01 15:42:28 -04:00
Joshua Boniface da98a4d445 Ignore errors about removing keys 2023-09-01 15:42:28 -04:00
Joshua Boniface 6cf8948107 Add Ceph support for single-node clusters
Ensures that the pool default size/min size is set to something
reasonable for a single node (effective RAID-1) and replaces the default
CRUSH replicate_rule set for this situation with one choosing OSD
instead of host as the default.
2023-09-01 15:42:28 -04:00
Joshua Boniface e4ccafee73 Add cgroup delegation override
Required to solve the occasional
  libvirt: QEMU Driver error : Requested operation is not valid:
  cgroup CPUACCT controller is not mounted
problem, as per:
  https://answers.launchpad.net/ubuntu/+question/665132
2023-09-01 15:42:28 -04:00
Joshua Boniface e8fe165e00 Further optimize ownership agent output 2023-09-01 15:42:28 -04:00
Joshua Boniface cbea6e284c Make ownership check consistent with cmk-agent 2.1
The new CheckMK agent uses UID 998 (dynamic) for itself. This causes
ownership problems with the old logic of this check. Move instead to a
range, where the UIDs from 200-599 are reserved for administrators, and
check for this range explicitly. Also eliminates the exceptions for ceph
and 2000 from previous iterations.
2023-09-01 15:42:28 -04:00
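The range test described in cbea6e284c might look like the sketch below. Only the 200-599 bounds and the UID 998 example come from the commit message; the constant and function names are hypothetical, and how the real check applies the result to file ownership is not shown here.

```python
# Bounds taken from the commit message: UIDs 200-599 are reserved
# for administrators.
ADMIN_UID_MIN = 200
ADMIN_UID_MAX = 599


def is_admin_uid(uid: int) -> bool:
    """Return True if `uid` falls inside the reserved administrator range."""
    return ADMIN_UID_MIN <= uid <= ADMIN_UID_MAX
```

With an explicit range, a dynamically assigned UID such as 998 (the CheckMK agent) falls outside the administrator range without needing a per-user exception.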
Joshua Boniface 9e20e47903 Update freshness checks 2023-09-01 15:42:28 -04:00
Joshua Boniface d47d320bb3 Replace freshness and kernel_version checks
Use an updated plugin from BLSE that uses needrestart instead of manual
parsing of these elements.
2023-09-01 15:42:28 -04:00
Joshua Boniface ea9fe5570f Add method to remove inactive SSH keys 2023-09-01 15:42:28 -04:00
Joshua Boniface 25dde4709b Ensure packages are installed as newhost 2023-09-01 15:42:28 -04:00
Joshua Boniface 4dfd877c7f Ensure Admin users are in additional groups 2023-09-01 15:42:28 -04:00
Joshua Boniface ce9304e43e Populate /etc/timezone as well 2023-09-01 15:42:28 -04:00
Joshua Boniface 9fe43efac2 Convert default libvirtd to template 2023-09-01 15:42:28 -04:00
Joshua Boniface aa6b4ac3dc Make locale generation universal
Don't rely on a notify/handler, just do it every time in the base role.
2023-09-01 15:42:28 -04:00
Joshua Boniface 91ca3d1510 Ensure insecure_global_id_reclaim is false 2023-09-01 15:42:28 -04:00
Joshua Boniface 3397dacab4 Fix bugs with Patroni bootstrap 2023-09-01 15:42:28 -04:00
Joshua Boniface 1838f8ff56 Add proper PostgreSQL versioning 2023-09-01 15:42:28 -04:00
Joshua Boniface 773fd5a9d4 Ensure all zkCli has -server set 2023-09-01 15:42:28 -04:00
Joshua Boniface 0e9d0b3294 Fix incorrect postgresql version 2023-09-01 15:42:28 -04:00
Joshua Boniface 35dcf979f4 Customize grub distributor 2023-09-01 15:42:28 -04:00
Joshua Boniface ba81a106d2 Set postfix to listen on all interfaces
Binding to just localhost was causing problems.
2023-09-01 15:42:28 -04:00
Joshua Boniface a87745d640 Fix name of task 2023-09-01 15:42:28 -04:00
Joshua Boniface d6cb28b639 Add immutability to PVC subrole
1. Remove the obsolete pvc-vacuum script install.

2. Remove notifies when modifying configs; we do not want to restart the
daemons uncontrolled.

3. Add bootstrap check to package installs so they only happen on
bootstrap.

This ensures this part of the role, on re-runs, will *only* update
configs and not actually touch the running daemon. This makes it safe to
run before a oneshot/update-pvc-daemons.yml playbook run.
2023-09-01 15:42:28 -04:00
Joshua Boniface 77be96bf6f Fix a few more splits 2023-09-01 15:42:28 -04:00
Joshua Boniface 95b47f8b09 Fix a few more extraneous splits
Just use this_node if applicable, or the raw node.hostname.
2023-09-01 15:42:28 -04:00
Joshua Boniface 87803cb7a2 Remove extraneous splits
The node.hostname should always be short.
2023-09-01 15:42:28 -04:00
Joshua Boniface d24cb8a8ef Unify and standardize inventory_hostname
This was causing some confusing conflicts, so create a new fact called
"this_node" which is inventory_hostname.split('.')[0], i.e. the short
name, and use that everywhere instead of an FQDN or true inventory
hostname.
2023-09-01 15:42:28 -04:00
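The "this_node" derivation in d24cb8a8ef is the expression inventory_hostname.split('.')[0]. The same operation in plain Python, as a small sketch (the function name and the example hostnames are hypothetical):

```python
def short_hostname(inventory_hostname: str) -> str:
    """Return everything before the first dot, i.e. the short hostname."""
    return inventory_hostname.split(".")[0]
```

A name with no dots, such as "hv1", is returned unchanged, so the fact is safe to use whether the inventory lists short names or FQDNs.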
Joshua Boniface 056c325486 Add option for setting CPU governor
Allows the administrator to set a CPU frequency governor if they need
to, though the default of ondemand is usually sufficient.
2023-09-01 15:42:28 -04:00
Joshua Boniface fc5bcf139c Fix name of IPMI check again 2023-09-01 15:42:28 -04:00
Joshua Boniface 44cedf66c9 Fix name of ipmi check 2023-09-01 15:42:28 -04:00
Joshua Boniface 9f7dbfb4f8 Add IPMI check to tasks 2023-09-01 15:42:28 -04:00
Joshua Boniface b9ae4d1009 Adjust headers and add LOM check 2023-09-01 15:42:27 -04:00
Joshua Boniface 48fb21af75 Add node list to PVC MOTD 2023-09-01 15:42:27 -04:00
Joshua Boniface e009cf4076 Fix whitespaced manufacturer and bad [[ 2023-09-01 15:42:27 -04:00
Joshua Boniface e65f1d15a6 Add coordinator state to MOTD 2023-09-01 15:42:27 -04:00
Joshua Boniface 894ce9b517 Support unknown manufacturers in MOTD 2023-09-01 15:42:27 -04:00
Joshua Boniface 55ec177919 Ignore errors restarting libvirtd
This seems to inexplicably fail sometimes. We can just ignore it.
2023-09-01 15:42:27 -04:00
Joshua Boniface b814ec60f6 Add resolv.conf customization 2023-09-01 15:42:27 -04:00
Joshua Boniface ddecb94348 Disable unified cgroup hierarchy on kernel cmdline
This is required on Debian 11 to use the cset tool, since the newer
systemd implementation of a unified cgroup hierarchy is not compatible
with the cset tool.

Ref for future use:
  https://github.com/lpechacek/cpuset/issues/40
2023-09-01 15:42:27 -04:00
Joshua Boniface be3ce67574 Use inventory_hostname in IPMI fragment 2023-09-01 15:42:27 -04:00
Joshua Boniface 5f05835721 Update bondX configuration 2023-09-01 15:42:27 -04:00
Joshua Boniface 4cb2d7835c Add setting bridge_mtu to config 2023-09-01 15:42:27 -04:00
Joshua Boniface 9f16995f59 Add smartmontools to base package list 2023-09-01 15:42:27 -04:00
Joshua Boniface 6e2d661134 Adjust documentation and behaviour of cpuset
1. Detail the caveats and specific situations and ref the documentation
which will provide more details.

2. Always install the configs, but use /etc/default/ceph-osd-cpuset to
control if the script does anything or not (so, the "osd" cset set is
always active, just not set in a special way).
2023-09-01 15:42:27 -04:00
Joshua Boniface 83bd1b1efd Install cset configs even if disabled
The setup script handles this instead.
2023-09-01 15:42:27 -04:00
Joshua Boniface 7927ec4f11 Allow dynamic enabling/disabling of cset
Add a separate config to handle enable/disable on the system itself.
2023-09-01 15:42:27 -04:00
Joshua Boniface 2ae9b9075a Adjust default ceph.conf parameters
1. Remove an explicit OSD journal size, especially such a small one (no
clue why I ever added that...)

2. Add max scrubs, disable scrub during recovery, and set scrub sleep.

3. Add max backfills, tune recovery sleep to 0 to prioritize recovery.
2023-09-01 15:42:27 -04:00
Joshua Boniface 6e48d6fe84 Add Ceph OSD cpuset tuning options
Allows an administrator to set CPU pinning with the cpuset tool for Ceph
OSDs, in situations where CPU contention with VMs or other system tasks
may be negatively affecting OSD performance. This is optional, advanced
tuning and is disabled by default.
2023-09-01 15:42:27 -04:00
Joshua Boniface 45424a28ce Fix bad flag 2023-09-01 15:42:27 -04:00
Joshua Boniface 044a14fa6d Add package installs for different Debian versions 2023-09-01 15:42:27 -04:00
Joshua Boniface ae40227ea1 Move paths and keys to defaults 2023-09-01 15:42:27 -04:00
Joshua Boniface f25a80ff53 Add additional CMK checks 2023-09-01 15:42:26 -04:00
Joshua Boniface 8c2d117a3c Wait longer when restarting services
From 15 -> 30 seconds to ensure more time for stabilization before
proceeding with the next.
2023-09-01 15:42:26 -04:00
Joshua Boniface 647ca1c446 Add default features flag to ceph.conf generator
Coupled with the removal of explicit --image-features flags to the RBD
command in PVC itself, this ensures that only the two features supported
on kernel 4.19 are enabled by default.
2023-09-01 15:42:26 -04:00
Joshua Boniface 86eaeed2b4 Fix sources.list for Bullseye 2023-09-01 15:42:26 -04:00
Joshua Boniface 3d64ad2420 Typo fix 2023-09-01 15:42:26 -04:00
Joshua Boniface eaea860b61 Lower autopurge interval to 1 hour 2023-09-01 15:42:26 -04:00
Joshua Boniface 524f857f56 Add some Zookeeper configuration tweaks 2023-09-01 15:42:26 -04:00
Joshua Boniface 13556918d7 Disable any systemd start rate limiting
Because Zookeeper is supremely stupid (see last commit) we want to
disable start limiting. It needs to keep trying forever until it starts.
2023-09-01 15:42:26 -04:00
Joshua Boniface 8eecc95f2f Ensure Zookeeper restarts itself
The Zookeeper daemon does not appear to exit with any status other than
0, even after a fatal error. Work around this.
2023-09-01 15:42:26 -04:00
Joshua Boniface b03ecf0125 Add -XX:+AlwaysPreTouch option for Zookeeper 2023-09-01 15:42:26 -04:00
Joshua Boniface b842276002 Lower keep count for Zookeeper vacuum to 3
Required to keep disk space growth down when using zookeeper_logging
functionality.
2023-09-01 15:42:26 -04:00
Joshua Boniface 681afd1d1b Fix excessive whitespace 2023-09-01 15:42:26 -04:00
Joshua Boniface 2d31e6c8ea Fix memory tuning issues 2023-09-01 15:42:26 -04:00
Joshua Boniface 71b6da6555 Adjust package lists per Debian version 2023-09-01 15:42:26 -04:00
Joshua Boniface 4b0a4ae73c Fix bad Ansible variable name 2023-09-01 15:42:26 -04:00
Joshua Boniface a52d4cbf37 Add Zookeeper logging configs 2023-09-01 15:42:26 -04:00
Joshua Boniface 7bacbd5dd6 Don't fail if IPMI tasks fail 2023-09-01 15:42:26 -04:00
Joshua Boniface eef0f959dd Add GRUB, Plymouth themes and issue for PVC 2023-09-01 15:42:26 -04:00
Joshua Boniface 6d3e5ac728 Fix zkcli for good 2023-09-01 15:42:26 -04:00
Joshua Boniface e760114b8d Fix bootstrap collection path for Ceph 2023-09-01 15:42:26 -04:00
Joshua Boniface bace67b8bf Add GRUB configuration to Ansible role 2023-09-01 15:42:26 -04:00
Joshua Boniface 0802cca980 Support both versions of psycopg2 and kazoo 2023-09-01 15:42:26 -04:00
Joshua Boniface 31a677b444 Fix Patroni ACL to use subnet mask 2023-09-01 15:42:26 -04:00