176 Commits

Author SHA1 Message Date
3397dacab4 Fix bugs with Patroni bootstrap 2023-09-01 15:42:28 -04:00
1838f8ff56 Add proper PostgreSQL versioning 2023-09-01 15:42:28 -04:00
773fd5a9d4 Ensure all zkCli calls have -server set 2023-09-01 15:42:28 -04:00
0e9d0b3294 Fix incorrect postgresql version 2023-09-01 15:42:28 -04:00
d6cb28b639 Add immutability to PVC subrole
1. Remove the obsolete pvc-vacuum script install.

2. Remove notifies when modifying configs; we do not want to restart the
daemons uncontrolled.

3. Add bootstrap check to package installs so they only happen on
bootstrap.

This ensures this part of the role, on re-runs, will *only* update
configs and not actually touch the running daemon. This makes it safe to
run before a oneshot/update-pvc-daemons.yml playbook run.
2023-09-01 15:42:28 -04:00
77be96bf6f Fix a few more splits 2023-09-01 15:42:28 -04:00
95b47f8b09 Fix a few more extraneous splits
Just use this_node if applicable, or the raw node.hostname.
2023-09-01 15:42:28 -04:00
87803cb7a2 Remove extraneous splits
The node.hostname should always be short.
2023-09-01 15:42:28 -04:00
d24cb8a8ef Unify and standardize inventory_hostname
This was causing some confusing conflicts, so create a new fact called
"this_node" which is inventory_hostname.split('.')[0], i.e. the short
name, and use that everywhere instead of an FQDN or true inventory
hostname.
2023-09-01 15:42:28 -04:00
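
The "this_node" derivation described in d24cb8a8ef can be sketched in plain Python. This is only an illustration of the split expression quoted in the commit message, not the role's actual task; the function name short_hostname and the example hostnames are hypothetical.

```python
def short_hostname(inventory_hostname: str) -> str:
    """Return the part of a hostname before the first dot.

    Mirrors the expression inventory_hostname.split('.')[0] quoted in the
    commit message. The function name is hypothetical.
    """
    return inventory_hostname.split(".")[0]


# Hypothetical example hostnames, not taken from the repository.
fqdn_result = short_hostname("hv1.example.tld")
short_result = short_hostname("hv1")
print(fqdn_result, short_result)
```

A name that is already short passes through unchanged, so the fact is safe to use everywhere. Note that an inventory entry given as a bare IP address would be truncated to its first octet by this expression.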
55ec177919 Ignore errors restarting libvirtd
This seems to inexplicably fail sometimes. We can just ignore it.
2023-09-01 15:42:27 -04:00
4cb2d7835c Add setting bridge_mtu to config 2023-09-01 15:42:27 -04:00
6e2d661134 Adjust documentation and behaviour of cpuset
1. Detail the caveats and specific situations, and reference the
documentation, which provides more details.

2. Always install the configs, but use /etc/default/ceph-osd-cpuset to
control whether the script does anything (so the "osd" cset set is
always active, just not configured in a special way).
2023-09-01 15:42:27 -04:00
83bd1b1efd Install cset configs even if disabled
The setup script handles this instead.
2023-09-01 15:42:27 -04:00
7927ec4f11 Allow dynamic enabling/disabling of cset
Add a separate config to handle enable/disable on the system itself.
2023-09-01 15:42:27 -04:00
2ae9b9075a Adjust default ceph.conf parameters
1. Remove an explicit OSD journal size, especially such a small one (no
clue why I ever added that...)

2. Add max scrubs, disable scrub during recovery, and set scrub sleep.

3. Add max backfills, tune recovery sleep to 0 to prioritize recovery.
2023-09-01 15:42:27 -04:00
6e48d6fe84 Add Ceph OSD cpuset tuning options
Allows an administrator to set CPU pinning with the cpuset tool for Ceph
OSDs, in situations where CPU contention with VMs or other system tasks
may be negatively affecting OSD performance. This is optional, advanced
tuning and is disabled by default.
2023-09-01 15:42:27 -04:00
f25a80ff53 Add additional CMK checks 2023-09-01 15:42:26 -04:00
8c2d117a3c Wait longer when restarting services
Increase the wait from 15 to 30 seconds to allow more time for
stabilization before proceeding with the next service.
2023-09-01 15:42:26 -04:00
647ca1c446 Add default features flag to ceph.conf generator
Coupled with the removal of explicit --image-features flags to the RBD
command in PVC itself, this ensures that only the two features supported
on kernel 4.19 are enabled by default.
2023-09-01 15:42:26 -04:00
3d64ad2420 Typo fix 2023-09-01 15:42:26 -04:00
eaea860b61 Lower autopurge interval to 1 hour 2023-09-01 15:42:26 -04:00
524f857f56 Add some Zookeeper configuration tweaks 2023-09-01 15:42:26 -04:00
13556918d7 Disable any systemd start rate limiting
Because Zookeeper is supremely stupid (see last commit) we want to
disable start limiting. It needs to keep trying forever until it starts.
2023-09-01 15:42:26 -04:00
8eecc95f2f Ensure Zookeeper restarts itself
The Zookeeper daemon does not appear to exit with any status other than
0, even after a fatal error. Work around this.
2023-09-01 15:42:26 -04:00
b03ecf0125 Add -XX:+AlwaysPreTouch option for Zookeeper 2023-09-01 15:42:26 -04:00
b842276002 Lower keep count for Zookeeper vacuum to 3
Required to keep disk space growth down when using zookeeper_logging
functionality.
2023-09-01 15:42:26 -04:00
681afd1d1b Fix excessive whitespace 2023-09-01 15:42:26 -04:00
2d31e6c8ea Fix memory tuning issues 2023-09-01 15:42:26 -04:00
71b6da6555 Adjust package lists per Debian version 2023-09-01 15:42:26 -04:00
a52d4cbf37 Add Zookeeper logging configs 2023-09-01 15:42:26 -04:00
e760114b8d Fix bootstrap collection path for Ceph 2023-09-01 15:42:26 -04:00
0802cca980 Support both versions of psycopg2 and kazoo 2023-09-01 15:42:26 -04:00
31a677b444 Fix Patroni ACL to use subnet mask 2023-09-01 15:42:26 -04:00
a2ed38b459 Add generic SR-IOV configuration 2023-09-01 15:42:26 -04:00
388db6ad1d Use IPs for Patroni configuration 2023-09-01 15:42:26 -04:00
d455b31905 Bump max connections in Zookeeper to 200 2023-09-01 15:42:26 -04:00
f105f0497c Configure Zookeeper only on Cluster address 2023-09-01 15:42:26 -04:00
7e94dddb4c Ensure libvirtd restarts when unit changes 2023-09-01 15:42:26 -04:00
0bbb91fc8b Add custom override libvirtd.service unit
This has no functional change on Buster, but on Bullseye this overrides
the stupid socket-based activation shenanigans that the default unit
tries to do, as well as the breaking replacement of the
/etc/default/libvirt variable names.
2023-09-01 15:42:26 -04:00
0114ad8ed5 Add python3 version of psycopg2 explicitly 2023-09-01 15:42:26 -04:00
027a819a83 Move some other tasks to bootstrap role
Avoids an issue where the pvcnoded service is stopped on non-bootstrap
runs.
2023-09-01 15:42:25 -04:00
6a61f8f7bf Update relative path to bootstrap files 2023-09-01 15:42:25 -04:00
4caab67d03 Remove superfluous symlink 2023-09-01 15:42:25 -04:00
39b8229c35 Add libguestfs-tools to libvirt role deps 2023-09-01 15:42:25 -04:00
0bf9c6209c Fix incorrect systemd enabling in Patroni 2023-09-01 15:42:25 -04:00
4f5dbee8ee Correct bugs during bootstrap
1. Ensure Zookeeper restarts and checks out successfully before
proceeding with other steps.
2. Make sure PVC itself doesn't start prematurely.
2023-09-01 15:42:25 -04:00
26dbd082ef Retry pgsql bootstrap startup 6 times
This will sometimes fail, so retry it several times.
2023-09-01 15:42:25 -04:00
e9f08ad100 Retry msgr2 enabling 6 times
This will sometimes fail, so retry it several times.
2023-09-01 15:42:25 -04:00
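
The retry behaviour described in 26dbd082ef and e9f08ad100 can be sketched generically in Python. This is a minimal sketch, assuming "retry 6 times" means up to six attempts in total; the helper retry, the delay parameter, and the flaky_start function are hypothetical and are not taken from the playbooks.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], retries: int = 6, delay: float = 5.0) -> T:
    """Call action until it succeeds, making at most `retries` attempts.

    Sleeps `delay` seconds between failed attempts. Both defaults are
    assumptions made for illustration.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, retries + 1):
        try:
            return action()
        except Exception as exc:  # a real task would catch narrower errors
            last_error = exc
            if attempt < retries:
                time.sleep(delay)
    raise RuntimeError(f"failed after {retries} attempts") from last_error


# Hypothetical flaky step: fails twice, then succeeds on the third attempt.
attempts = {"count": 0}


def flaky_start() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("service not ready yet")
    return "started"


result = retry(flaky_start, retries=6, delay=0.0)
print(result, "after", attempts["count"], "attempts")
```

Bounding the number of attempts keeps a genuinely broken bootstrap from looping forever while still absorbing the occasional transient failure the commit messages mention.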
a77e41bf7c Remove invalid timezone entries in postgres conf 2023-09-01 15:42:25 -04:00
cba276e248 Add default values 2023-09-01 15:42:24 -04:00