Joshua Boniface
17f819ea3f
Don't set "latest" for libvirt packages
...
Avoids errors during runs before upgrades.
2023-10-24 10:41:47 -04:00
Joshua Boniface
c11f896a60
Fix zk_status check target znode
2023-10-22 00:42:43 -04:00
Joshua Boniface
f4bbdb7c86
Use full path for uuidgen
2023-09-29 03:00:53 -04:00
Joshua Boniface
83636388f0
Add configurable monitoring interval
2023-09-15 22:31:16 -04:00
Joshua Boniface
8ebb8a8339
Disable autoscale via command
...
As per [1] the ceph.conf option does not work properly and must be set this way.
[1] https://stackoverflow.com/questions/63853436/ceph-octopus-setting-autoscale-mode-from-ceph-conf-file
2023-09-02 01:59:47 -04:00
Joshua Boniface
a10b3e8d4a
Lower default pgs and disable autoscale
2023-09-01 23:54:10 -04:00
Joshua Boniface
cf426408f2
Restore original setting
2023-09-01 16:18:20 -04:00
Joshua Boniface
3680717daa
Remove extra restarts on bootstrap
2023-09-01 15:42:30 -04:00
Joshua Boniface
1f4cd92d63
Fix bad calls to node primary
2023-09-01 15:42:30 -04:00
Joshua Boniface
6da9956811
Fix delegate_to
2023-09-01 15:42:30 -04:00
Joshua Boniface
fb60093750
Ignore errors in Patroni restart handler
2023-09-01 15:42:30 -04:00
Joshua Boniface
7b061966ad
Ignore errors in Patroni
...
Required during upgrades as the service may be masked.
2023-09-01 15:42:30 -04:00
Joshua Boniface
1e497413e8
Remove extra whitespace
2023-09-01 15:42:30 -04:00
Joshua Boniface
64ce09122d
Add additional primary node switch
2023-09-01 15:42:30 -04:00
Joshua Boniface
353399a407
Ensure core pg_hba entries are present
2023-09-01 15:42:30 -04:00
Joshua Boniface
b21778f117
Fix Patroni upgrade and D12 support
2023-09-01 15:42:30 -04:00
Joshua Boniface
9411679004
Fix reboot
2023-09-01 15:42:30 -04:00
Joshua Boniface
7c8b6919fe
Add Debian 12 Patroni config
2023-09-01 15:42:30 -04:00
Joshua Boniface
2ba8f1cfc3
Add retries to all apt commands
2023-09-01 15:42:30 -04:00
Joshua Boniface
d54844746e
Ignore errors enabling vhostmd
...
Seems to cause issues in bookworm.
2023-09-01 15:42:30 -04:00
Joshua Boniface
71d956dab7
Add final pvcnoded restart
2023-09-01 15:42:30 -04:00
Joshua Boniface
f79d1da5be
Update other commands to use new CLI format
2023-09-01 15:42:30 -04:00
Joshua Boniface
0d3e525f12
Update link to one level higher
2023-09-01 15:42:29 -04:00
Joshua Boniface
017e1405ed
Use debian_version custom fact
2023-09-01 15:42:29 -04:00
Joshua Boniface
f8ef2602bc
Revert "Fix symlink to be one level up"
...
This reverts commit 7693b2d78f.
2023-09-01 15:42:29 -04:00
Joshua Boniface
dcaa0228b7
Fix symlink to be one level up
2023-09-01 15:42:29 -04:00
Joshua Boniface
cab4deac26
Add configuration field for plugins
2023-09-01 15:42:29 -04:00
Joshua Boniface
c2b576334f
Adjust plugin log config field for 0.9.62
2023-09-01 15:42:29 -04:00
Joshua Boniface
561ecb5c61
Adjust name of bootstrap trigger variable
...
The PVC bootstrap framework overrides this variable and wreaks havoc on
it. Instead adjust our side so that it looks for do_bootstrap instead.
2023-09-01 15:42:29 -04:00
Joshua Boniface
5a48ec4d79
Ensure CPU tuning is only applied on Debian 11+
2023-09-01 15:42:29 -04:00
Joshua Boniface
07d75573d6
Add updated tuning configuration
...
Uses a much nicer CPU tuning configuration, leveraging systemd's
AllowedCPUs and CPUAffinity options within a set of slices (some
default, some custom).
Configuration is also greatly simplified versus the previous
implementation, simply asking for a number of CPUs for both the system
and OSDs, and calculating everything else that is required.
Also switches (back) to the v2 unified cgroup hierarchy by default as
required by the systemd AllowedCPUs directive.
2023-09-01 15:42:29 -04:00
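To illustrate the "ask for two CPU counts, calculate the rest" idea from the tuning commit above, here is a minimal sketch. It is not the role's actual calculation, which this log does not show. The function name `partition_cpus`, the slice labels, and the contiguous, zero-based layout are all assumptions, and it ignores NUMA and hyperthread sibling placement. The output strings use the dash range notation that systemd's AllowedCPUs accepts.

```python
def partition_cpus(total, system_cpus, osd_cpus):
    """Split CPU indices 0..total-1 into system, OSD, and VM ranges.

    Hypothetical helper: assumes contiguous allocation, system first,
    then OSDs, with every remaining CPU left for VMs.
    """
    if system_cpus < 1 or osd_cpus < 1:
        raise ValueError("system and OSD CPU counts must be at least 1")
    if system_cpus + osd_cpus >= total:
        raise ValueError("no CPUs would be left for VMs")

    def span(start, count):
        # Render a single index as "N" and a run as "N-M".
        return str(start) if count == 1 else f"{start}-{start + count - 1}"

    vm_start = system_cpus + osd_cpus
    return {
        "system": span(0, system_cpus),
        "osd": span(system_cpus, osd_cpus),
        "vm": span(vm_start, total - vm_start),
    }


layout = partition_cpus(total=16, system_cpus=2, osd_cpus=4)
print(layout)
```

On a 16-CPU node this gives the system CPUs 0-1, the OSDs CPUs 2-5, and leaves 6-15 for everything else.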
Joshua Boniface
fa4f1cff0f
Adjust variable used for migrate selector
2023-09-01 15:42:29 -04:00
Joshua Boniface
1d35fec8a8
Remove cpuset configurations
...
This functionality simply did not work, with Libvirt continuing to dump
its processes into the root cset thus defeating the purpose entirely.
Just remove it; from some very initial testing, it isn't worth the
headache.
2023-09-01 15:42:29 -04:00
Joshua Boniface
f51fc2ce64
Fix setting of csets for OSDs
2023-09-01 15:42:29 -04:00
Joshua Boniface
144f519e76
Add rinse dependency for provisioner
2023-09-01 15:42:29 -04:00
Joshua Boniface
be091f66d4
Remove pvc-flush references
...
This service causes more problems than it solves usually, so it is being
removed in the next PVC version.
2023-09-01 15:42:28 -04:00
Joshua Boniface
6cf8948107
Add Ceph support for single-node clusters
...
Ensures that the pool default size/min size is set to something
reasonable for a single node (effective RAID-1) and replaces the default
CRUSH replicate_rule set for this situation with one choosing OSD
instead of host as the default.
2023-09-01 15:42:28 -04:00
Joshua Boniface
e4ccafee73
Add cgroup delegation override
...
Required to solve the occasional
libvirt: QEMU Driver error : Requested operation is not valid:
cgroup CPUACCT controller is not mounted
problem, as per:
https://answers.launchpad.net/ubuntu/+question/665132
2023-09-01 15:42:28 -04:00
Joshua Boniface
25dde4709b
Ensure packages are installed as newhost
2023-09-01 15:42:28 -04:00
Joshua Boniface
4dfd877c7f
Ensure Admin users are in additional groups
2023-09-01 15:42:28 -04:00
Joshua Boniface
9fe43efac2
Convert default libvirtd to template
2023-09-01 15:42:28 -04:00
Joshua Boniface
aa6b4ac3dc
Make locale generation universal
...
Don't rely on a notify/handler, just do it every time in the base role.
2023-09-01 15:42:28 -04:00
Joshua Boniface
91ca3d1510
Ensure insecure_global_id_reclaim is false
2023-09-01 15:42:28 -04:00
Joshua Boniface
3397dacab4
Fix bugs with Patroni bootstrap
2023-09-01 15:42:28 -04:00
Joshua Boniface
1838f8ff56
Add proper PostgreSQL versioning
2023-09-01 15:42:28 -04:00
Joshua Boniface
773fd5a9d4
Ensure all zkCli has -server set
2023-09-01 15:42:28 -04:00
Joshua Boniface
0e9d0b3294
Fix incorrect postgresql version
2023-09-01 15:42:28 -04:00
Joshua Boniface
d6cb28b639
Add immutability to PVC subrole
...
1. Remove the obsolete pvc-vacuum script install.
2. Remove notifies when modifying configs; we do not want to restart the
daemons uncontrolled.
3. Add bootstrap check to package installs so they only happen on
bootstrap.
This ensures this part of the role, on re-runs, will *only* update
configs and not actually touch the running daemon. This makes it safe to
run before a oneshot/update-pvc-daemons.yml playbook run.
2023-09-01 15:42:28 -04:00
Joshua Boniface
77be96bf6f
Fix a few more splits
2023-09-01 15:42:28 -04:00
Joshua Boniface
95b47f8b09
Fix a few more extraneous splits
...
Just use this_node if applicable, or the raw node.hostname.
2023-09-01 15:42:28 -04:00
Joshua Boniface
87803cb7a2
Remove extraneous splits
...
The node.hostname should always be short.
2023-09-01 15:42:28 -04:00
Joshua Boniface
d24cb8a8ef
Unify and standardize inventory_hostname
...
This was causing some confusing conflicts, so create a new fact called
"this_node" which is inventory_hostname.split('.')[0], i.e. the short
name, and use that everywhere instead of an FQDN or true inventory
hostname.
2023-09-01 15:42:28 -04:00
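The `this_node` expression quoted in the commit above is ordinary Python string handling, so its behaviour can be checked outside of Ansible. This is a small sketch; the function name `short_name` and the example hostnames are made up for illustration.

```python
def short_name(inventory_hostname):
    """Return everything before the first dot, i.e. the short hostname."""
    return inventory_hostname.split(".")[0]


print(short_name("hv1.example.tld"))  # FQDN is reduced to its first label
print(short_name("hv1"))              # an already-short name is unchanged
```

One caveat of this approach: an inventory entry that is a bare IP address would be reduced to its first octet, so it only makes sense when inventory names are hostnames.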
Joshua Boniface
55ec177919
Ignore errors restarting libvirtd
...
This seems to inexplicably fail sometimes. We can just ignore it.
2023-09-01 15:42:27 -04:00
Joshua Boniface
4cb2d7835c
Add setting bridge_mtu to config
2023-09-01 15:42:27 -04:00
Joshua Boniface
6e2d661134
Adjust documentation and behaviour of cpuset
...
1. Detail the caveats and specific situations and ref the documentation
which will provide more details.
2. Always install the configs, but use /etc/default/ceph-osd-cpuset to
control if the script does anything or not (so, the "osd" cset set is
always active, just not set in a special way).
2023-09-01 15:42:27 -04:00
Joshua Boniface
83bd1b1efd
Install cset configs even if disabled
...
The setup script handles this instead.
2023-09-01 15:42:27 -04:00
Joshua Boniface
7927ec4f11
Allow dynamic enabling/disabling of cset
...
Add a separate config to handle enable/disable on the system itself.
2023-09-01 15:42:27 -04:00
Joshua Boniface
2ae9b9075a
Adjust default ceph.conf parameters
...
1. Remove an explicit OSD journal size, especially such a small one (no
clue why I ever added that...)
2. Add max scrubs, disable scrub during recovery, and set scrub sleep.
3. Add max backfills, tune recovery sleep to 0 to prioritize recovery.
2023-09-01 15:42:27 -04:00
Joshua Boniface
6e48d6fe84
Add Ceph OSD cpuset tuning options
...
Allows an administrator to set CPU pinning with the cpuset tool for Ceph
OSDs, in situations where CPU contention with VMs or other system tasks
may be negatively affecting OSD performance. This is optional, advanced
tuning and is disabled by default.
2023-09-01 15:42:27 -04:00
Joshua Boniface
f25a80ff53
Add additional CMK checks
2023-09-01 15:42:26 -04:00
Joshua Boniface
8c2d117a3c
Wait longer when restarting services
...
From 15 -> 30 seconds to ensure more time for stabilization before
proceeding with the next.
2023-09-01 15:42:26 -04:00
Joshua Boniface
647ca1c446
Add default features flag to ceph.conf generator
...
Coupled with the removal of explicit --image-features flags to the RBD
command in PVC itself, this ensures that only the two features supported
on kernel 4.19 are enabled by default.
2023-09-01 15:42:26 -04:00
Joshua Boniface
3d64ad2420
Typo fix
2023-09-01 15:42:26 -04:00
Joshua Boniface
eaea860b61
Lower autopurge interval to 1 hour
2023-09-01 15:42:26 -04:00
Joshua Boniface
524f857f56
Add some Zookeeper configuration tweaks
2023-09-01 15:42:26 -04:00
Joshua Boniface
13556918d7
Disable any systemd start rate limiting
...
Because Zookeeper is supremely stupid (see last commit) we want to
disable start limiting. It needs to keep trying forever until it starts.
2023-09-01 15:42:26 -04:00
Joshua Boniface
8eecc95f2f
Ensure Zookeeper restarts itself
...
The Zookeeper daemon does not appear to exit with any status other than
0, even after a fatal error. Work around this.
2023-09-01 15:42:26 -04:00
Joshua Boniface
b03ecf0125
Add -XX:+AlwaysPreTouch option for Zookeeper
2023-09-01 15:42:26 -04:00
Joshua Boniface
b842276002
Lower keep count for Zookeeper vacuum to 3
...
Required to keep disk space growth down when using zookeeper_logging
functionality.
2023-09-01 15:42:26 -04:00
Joshua Boniface
681afd1d1b
Fix excessive whitespace
2023-09-01 15:42:26 -04:00
Joshua Boniface
2d31e6c8ea
Fix memory tuning issues
2023-09-01 15:42:26 -04:00
Joshua Boniface
71b6da6555
Adjust package lists per Debian version
2023-09-01 15:42:26 -04:00
Joshua Boniface
a52d4cbf37
Add Zookeeper logging configs
2023-09-01 15:42:26 -04:00
Joshua Boniface
e760114b8d
Fix bootstrap collection path for Ceph
2023-09-01 15:42:26 -04:00
Joshua Boniface
0802cca980
Support both versions of psycopg2 and kazoo
2023-09-01 15:42:26 -04:00
Joshua Boniface
31a677b444
Fix Patroni ACL to use subnet mask
2023-09-01 15:42:26 -04:00
Joshua Boniface
a2ed38b459
Add generic SR-IOV configuration
2023-09-01 15:42:26 -04:00
Joshua Boniface
388db6ad1d
Use IPs for Patroni configuration
2023-09-01 15:42:26 -04:00
Joshua Boniface
d455b31905
Bump max connections in Zookeeper to 200
2023-09-01 15:42:26 -04:00
Joshua Boniface
f105f0497c
Configure Zookeeper only on Cluster address
2023-09-01 15:42:26 -04:00
Joshua Boniface
7e94dddb4c
Ensure libvirtd restarts when unit changes
2023-09-01 15:42:26 -04:00
Joshua Boniface
0bbb91fc8b
Add override custom libvirtd.service unit
...
This has no functional change on Buster, but on Bullseye this overrides
the stupid socket-based activation shenanigans that the default unit
tries to do, as well as the breaking replacement of the
/etc/default/libvirt variable names.
2023-09-01 15:42:26 -04:00
Joshua Boniface
0114ad8ed5
Add python3 version of psycopg2 explicitly
2023-09-01 15:42:26 -04:00
Joshua Boniface
027a819a83
Move some other tasks to bootstrap role
...
Avoids an issue where the pvcnoded service is stopped on non-bootstrap
runs.
2023-09-01 15:42:25 -04:00
Joshua Boniface
6a61f8f7bf
Update relative path to bootstrap files
2023-09-01 15:42:25 -04:00
Joshua Boniface
4caab67d03
Remove superfluous symlink
2023-09-01 15:42:25 -04:00
Joshua Boniface
39b8229c35
Add libguestfs-tools to libvirt role deps
2023-09-01 15:42:25 -04:00
Joshua Boniface
0bf9c6209c
Fix incorrect systemd enabling in Patroni
2023-09-01 15:42:25 -04:00
Joshua Boniface
4f5dbee8ee
Correct bugs during bootstrap
...
1. Ensure Zookeeper restarts and checks out successfully before
proceeding with other steps.
2. Make sure PVC itself doesn't start prematurely.
2023-09-01 15:42:25 -04:00
Joshua Boniface
26dbd082ef
Retry pgsql bootstrap startup 6 times
...
This will sometimes fail, so retry it several times.
2023-09-01 15:42:25 -04:00
Joshua Boniface
e9f08ad100
Retry msgr2 enabling 6 times
...
This will sometimes fail, so retry it several times.
2023-09-01 15:42:25 -04:00
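The two retry commits above follow the same pattern: run a step that sometimes fails, and try again up to six times before giving up. In the role itself this is presumably expressed with Ansible's `retries`/`until` task keywords. The sketch below shows only the underlying logic in plain Python; `retry`, `flaky_start`, and the zero delay are illustrative assumptions, not code from the repository.

```python
import time


def retry(action, attempts=6, delay=0.0):
    """Call action() until it succeeds or the attempts are used up."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # pause before the next attempt
    # Every attempt failed: surface the final error to the caller.
    raise last_error


calls = {"n": 0}


def flaky_start():
    """Stand-in for a service start that fails on its first two tries."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("not ready yet")
    return "started"


result = retry(flaky_start)
print(result, "after", calls["n"], "attempts")
```

A fixed attempt count keeps a genuinely broken step from looping forever while still absorbing transient startup failures.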
Joshua Boniface
a77e41bf7c
Remove invalid timezone entries in postgres conf
2023-09-01 15:42:25 -04:00
Joshua Boniface
cba276e248
Add default values
2023-09-01 15:42:24 -04:00
Joshua Boniface
be94bc134f
Add configurable ZK memory limits
2023-09-01 15:42:24 -04:00
Joshua Boniface
6e74ac44a5
Remove libjemalloc package
2023-09-01 15:42:24 -04:00
Joshua Boniface
2bd5cc5a25
Tune Zookeeper memory usage
...
Use Xms and Xmx=128M to reduce overall Zookeeper memory usage.
2023-09-01 15:42:24 -04:00
Joshua Boniface
b4e36d146a
Add tuning for Ceph OSDs
2023-09-01 15:42:24 -04:00
Joshua Boniface
24764fe704
Don't use libjemalloc for Ceph daemons
...
This was an artifact of a much, much older Ceph configuration I ran, and
is not relevant with newer Ceph versions like those used in PVC.
Performance testing with Nautilus and Bluestore reveals a minimal
performance hit, and using `jemalloc` prevents cache autotuning from
being effective, so remove it.
2023-09-01 15:42:24 -04:00
Joshua Boniface
458e7b4872
Use new init command location
...
Command was renamed in the PVC CLI to facilitate other "task" actions
like backup/restore.
2023-09-01 15:42:24 -04:00
Joshua Boniface
bcb5962353
Add jute.maxbuffer to Zookeeper environment ops
...
Adds this option based on the findings of
https://github.com/python-zk/kazoo/issues/630 , whereby restores of >1MB
in size would fail. This is considered an unsafe option, but given our
use case no actual znode should ever exceed this limit; this is purely
for the large transactions that come from a `pvc task restore` action to
an empty Zookeeper instance.
2023-09-01 15:42:24 -04:00