Commit Graph

163 Commits

Author SHA1 Message Date
Joshua Boniface 5a48ec4d79 Ensure CPU tuning is only applied on Debian 11+ 2023-09-01 15:42:29 -04:00
Joshua Boniface 07d75573d6 Add updated tuning configuration
Uses a much nicer CPU tuning configuration, leveraging systemd's
AllowedCPUs and CPUAffinity options within a set of slices (some
default, some custom).

Configuration is also greatly simplified versus the previous
implementation, simply asking for a number of CPUs for both the system
and OSDs, and calculating everything else that is required.

Also switches (back) to the v2 unified cgroup hierarchy by default as
required by the systemd AllowedCPUs directive.
2023-09-01 15:42:29 -04:00
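As a rough illustration of "asking for a number of CPUs for both the system and OSDs, and calculating everything else", the sketch below derives CPU ranges in the comma/range form that systemd's AllowedCPUs accepts. The function names, the "vm" remainder group, and the contiguous layout (system CPUs first, then OSD CPUs, then the rest) are assumptions made for this example; the commit message does not state how the role actually lays out CPUs or names its slices.

```python
def cpu_range(start: int, count: int) -> str:
    """Format `count` consecutive CPU indices beginning at `start`, e.g. "2-5"."""
    if count < 1:
        raise ValueError("count must be at least 1")
    end = start + count - 1
    return str(start) if count == 1 else f"{start}-{end}"


def plan_cpu_sets(total_cpus: int, system_cpus: int, osd_cpus: int) -> dict:
    """Split CPUs into system, OSD, and remainder groups (hypothetical layout)."""
    remaining = total_cpus - system_cpus - osd_cpus
    if system_cpus < 1 or osd_cpus < 1 or remaining < 1:
        raise ValueError("each group needs at least one CPU")
    return {
        # Assumption: system CPUs are the lowest-numbered cores.
        "system": cpu_range(0, system_cpus),
        # Assumption: OSD CPUs follow immediately after the system CPUs.
        "osd": cpu_range(system_cpus, osd_cpus),
        # Assumption: everything left over goes to VM workloads.
        "vm": cpu_range(system_cpus + osd_cpus, remaining),
    }


if __name__ == "__main__":
    for group, cpus in plan_cpu_sets(total_cpus=16, system_cpus=2, osd_cpus=4).items():
        print(f"{group}: AllowedCPUs={cpus}")
```

With 16 CPUs, 2 for the system and 4 for OSDs, this yields "0-1", "2-5" and "6-15" respectively.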
Joshua Boniface 1d35fec8a8 Remove cpuset configurations
This functionality simply did not work, with Libvirt continuing to dump
its processes into the root cset, thus defeating the purpose entirely.

Just remove it; from some very initial testing, it isn't worth the
headache.
2023-09-01 15:42:29 -04:00
Joshua Boniface 144f519e76 Add rinse dependency for provisioner 2023-09-01 15:42:29 -04:00
Joshua Boniface be091f66d4 Remove pvc-flush references
This service usually causes more problems than it solves, so it is being
removed in the next PVC version.
2023-09-01 15:42:28 -04:00
Joshua Boniface 6cf8948107 Add Ceph support for single-node clusters
Ensures that the pool default size/min size is set to something
reasonable for a single node (effective RAID-1) and replaces the default
CRUSH replicate_rule set for this situation with one choosing OSD
instead of host as the default.
2023-09-01 15:42:28 -04:00
Joshua Boniface e4ccafee73 Add cgroup delegation override
Required to solve the occasional
  libvirt: QEMU Driver error : Requested operation is not valid:
  cgroup CPUACCT controller is not mounted
problem, as per:
  https://answers.launchpad.net/ubuntu/+question/665132
2023-09-01 15:42:28 -04:00
Joshua Boniface 25dde4709b Ensure packages are installed as newhost 2023-09-01 15:42:28 -04:00
Joshua Boniface 4dfd877c7f Ensure Admin users are in additional groups 2023-09-01 15:42:28 -04:00
Joshua Boniface 9fe43efac2 Convert default libvirtd to template 2023-09-01 15:42:28 -04:00
Joshua Boniface aa6b4ac3dc Make locale generation universal
Don't rely on a notify/handler, just do it every time in the base role.
2023-09-01 15:42:28 -04:00
Joshua Boniface 91ca3d1510 Ensure insecure_global_id_reclaim is false 2023-09-01 15:42:28 -04:00
Joshua Boniface 3397dacab4 Fix bugs with Patroni bootstrap 2023-09-01 15:42:28 -04:00
Joshua Boniface 1838f8ff56 Add proper PostgreSQL versioning 2023-09-01 15:42:28 -04:00
Joshua Boniface 773fd5a9d4 Ensure all zkCli has -server set 2023-09-01 15:42:28 -04:00
Joshua Boniface 0e9d0b3294 Fix incorrect postgresql version 2023-09-01 15:42:28 -04:00
Joshua Boniface d6cb28b639 Add immutability to PVC subrole
1. Remove the obsolete pvc-vacuum script install.

2. Remove notifies when modifying configs; we do not want to restart the
daemons uncontrolled.

3. Add bootstrap check to package installs so they only happen on
bootstrap.

This ensures this part of the role, on re-runs, will *only* update
configs and not actually touch the running daemon. This makes it safe to
run before a oneshot/update-pvc-daemons.yml playbook run.
2023-09-01 15:42:28 -04:00
Joshua Boniface 95b47f8b09 Fix a few more extraneous splits
Just use this_node if applicable, or the raw node.hostname.
2023-09-01 15:42:28 -04:00
Joshua Boniface d24cb8a8ef Unify and standardize inventory_hostname
This was causing some confusing conflicts, so create a new fact called
"this_node" which is inventory_hostname.split('.')[0], i.e. the short
name, and use that everywhere instead of an FQDN or true inventory
hostname.
2023-09-01 15:42:28 -04:00
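The expression quoted in this commit, inventory_hostname.split('.')[0], behaves the same way in plain Python as it does inside an Ansible template. A minimal sketch follows; the function name and the example hostnames are made up for illustration.

```python
def short_hostname(inventory_hostname: str) -> str:
    """Return the portion of the name before the first dot (the short name)."""
    return inventory_hostname.split(".")[0]


if __name__ == "__main__":
    # An FQDN is reduced to its first label; a short name is returned unchanged.
    print(short_hostname("hv1.example.tld"))
    print(short_hostname("hv1"))
```

Because a name without dots is returned unchanged, the same fact can be used whether the inventory lists hosts by FQDN or by short name.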
Joshua Boniface 6e2d661134 Adjust documentation and behaviour of cpuset
1. Detail the caveats and specific situations and ref the documentation
which will provide more details.

2. Always install the configs, but use /etc/default/ceph-osd-cpuset to
control if the script does anything or not (so, the "osd" cset set is
always active, just not set in a special way).
2023-09-01 15:42:27 -04:00
Joshua Boniface 83bd1b1efd Install cset configs even if disabled
The setup script handles this instead.
2023-09-01 15:42:27 -04:00
Joshua Boniface 7927ec4f11 Allow dynamic enabling/disabling of cset
Add a separate config to handle enable/disable on the system itself.
2023-09-01 15:42:27 -04:00
Joshua Boniface 2ae9b9075a Adjust default ceph.conf parameters
1. Remove an explicit OSD journal size, especially such a small one (no
clue why I ever added that...)

2. Add max scrubs, disable scrub during recovery, and set scrub sleep.

3. Add max backfills, tune recovery sleep to 0 to prioritize recovery.
2023-09-01 15:42:27 -04:00
Joshua Boniface 6e48d6fe84 Add Ceph OSD cpuset tuning options
Allows an administrator to set CPU pinning with the cpuset tool for Ceph
OSDs, in situations where CPU contention with VMs or other system tasks
may be negatively affecting OSD performance. This is optional, advanced
tuning and is disabled by default.
2023-09-01 15:42:27 -04:00
Joshua Boniface f25a80ff53 Add additional CMK checks 2023-09-01 15:42:26 -04:00
Joshua Boniface 647ca1c446 Add default features flag to ceph.conf generator
Coupled with the removal of explicit --image-features flags to the RBD
command in PVC itself, this ensures that only the two features supported
on kernel 4.19 are enabled by default.
2023-09-01 15:42:26 -04:00
Joshua Boniface 681afd1d1b Fix excessive whitespace 2023-09-01 15:42:26 -04:00
Joshua Boniface 71b6da6555 Adjust package lists per Debian version 2023-09-01 15:42:26 -04:00
Joshua Boniface e760114b8d Fix bootstrap collection path for Ceph 2023-09-01 15:42:26 -04:00
Joshua Boniface 0802cca980 Support both versions of psycopg2 and kazoo 2023-09-01 15:42:26 -04:00
Joshua Boniface 7e94dddb4c Ensure libvirtd restarts when unit changes 2023-09-01 15:42:26 -04:00
Joshua Boniface 0bbb91fc8b Add override custom libvirtd.service unit
This has no functional change on Buster, but on Bullseye this overrides
the stupid socket-based activation shenanigans that the default unit
tries to do, as well as the breaking replacement of the
/etc/default/libvirt variable names.
2023-09-01 15:42:26 -04:00
Joshua Boniface 0114ad8ed5 Add python3 version of psycopg2 explicitly 2023-09-01 15:42:26 -04:00
Joshua Boniface 027a819a83 Move some other tasks to bootstrap role
Avoids an issue where the pvcnoded service is stopped on non-bootstrap
runs.
2023-09-01 15:42:25 -04:00
Joshua Boniface 6a61f8f7bf Update relative path to bootstrap files 2023-09-01 15:42:25 -04:00
Joshua Boniface 39b8229c35 Add libguestfs-tools to libvirt role deps 2023-09-01 15:42:25 -04:00
Joshua Boniface 0bf9c6209c Fix incorrect systemd enabling in Patroni 2023-09-01 15:42:25 -04:00
Joshua Boniface 4f5dbee8ee Correct bugs during bootstrap
1. Ensure Zookeeper restarts and checks out successfully before
proceeding with other steps.
2. Make sure PVC itself doesn't start prematurely.
2023-09-01 15:42:25 -04:00
Joshua Boniface 26dbd082ef Retry pgsql bootstrap startup 6 times
This will sometimes fail, so retry it several times
2023-09-01 15:42:25 -04:00
Joshua Boniface e9f08ad100 Retry msgr2 enabling 6 times
This will sometimes fail, so retry it several times
2023-09-01 15:42:25 -04:00
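The two retry commits above describe a simple pattern: run a step that sometimes fails, and try again up to 6 times before giving up. The sketch below shows that pattern in plain Python; it is not the role's actual task definition, and the helper name, the delay parameter, and the flaky example step are all hypothetical.

```python
import time


def retry(action, attempts: int = 6, delay: float = 0.0):
    """Call `action` until it succeeds, re-raising the last error after `attempts` tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # broad on purpose: any failure triggers a retry
            last_error = error
            if attempt < attempts:
                time.sleep(delay)
    raise last_error


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_step():
        # Hypothetical step that fails on its first two calls, then succeeds.
        calls["count"] += 1
        if calls["count"] < 3:
            raise RuntimeError("not ready yet")
        return "started"

    print(retry(flaky_step), "after", calls["count"], "calls")
```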
Joshua Boniface 6e74ac44a5 Remove libjemalloc package 2023-09-01 15:42:24 -04:00
Joshua Boniface b4e36d146a Add tuning for Ceph OSDs 2023-09-01 15:42:24 -04:00
Joshua Boniface 458e7b4872 Use new init command location
Command was renamed in the PVC CLI to facilitate other "task" actions
like backup/restore.
2023-09-01 15:42:24 -04:00
Joshua Boniface f79fb605de Support using existing SSL certs on system
Add the additional pvc_api_ssl_cert_path and pvc_api_ssl_key_path
group_vars options, which can be used to set the SSL details to existing
files on the filesystem if desired. If these are empty (or nonexistent),
the original pvc_api_ssl_cert and pvc_api_ssl_key raw format options
will be used as they were.

Allows the administrator to use outside methods (such as Let's Encrypt)
to obtain the certs locally on the system, avoiding changes to the
group_vars and redeployment to manage SSL keys.
2023-09-01 15:42:24 -04:00
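The fallback described in this commit can be sketched as a small selection function: prefer the path-based options when both are set, otherwise fall back to the raw options. The four parameter names mirror the group_vars options named in the commit message; the function itself, its return shape, and the example paths are assumptions for illustration, and "empty (or nonexistent)" is interpreted here as the variable being an empty string or unset (None).

```python
from typing import Optional, Tuple


def select_ssl_source(
    pvc_api_ssl_cert_path: Optional[str],
    pvc_api_ssl_key_path: Optional[str],
    pvc_api_ssl_cert: str,
    pvc_api_ssl_key: str,
) -> Tuple[str, str, str]:
    """Return (kind, cert, key), where kind is "path" or "raw"."""
    # Empty strings and None are both treated as "not set".
    if pvc_api_ssl_cert_path and pvc_api_ssl_key_path:
        return ("path", pvc_api_ssl_cert_path, pvc_api_ssl_key_path)
    return ("raw", pvc_api_ssl_cert, pvc_api_ssl_key)


if __name__ == "__main__":
    # Paths set: use the existing files on the system (example paths only).
    print(select_ssl_source("/etc/ssl/pvc/cert.pem", "/etc/ssl/pvc/key.pem", "RAW-CERT", "RAW-KEY"))
    # Paths unset: fall back to the raw-format options.
    print(select_ssl_source(None, "", "RAW-CERT", "RAW-KEY"))
```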
Joshua Boniface 2caed2ae12 Rename remaining "pvc_prov" items to pvc_api 2023-09-01 15:42:24 -04:00
Joshua Boniface fbbf5ffe09 Use cluster_group variable for paths
Instead of trying to automagic this group out of the Ansible hostvars,
just make it explicitly defined in the group_vars to avoid any
confusion.
2023-09-01 15:42:23 -04:00
Joshua Boniface a925e4bd40 Ignore errors in bringing up bootstrap interfaces 2023-09-01 15:42:23 -04:00
Joshua Boniface 12d50cfca6 Use correct syntax for init command 2023-09-01 15:42:23 -04:00
Joshua Boniface 6a3c32f306 Use local CLI command instead of API to init 2023-09-01 15:42:23 -04:00
Joshua Boniface c71415317a Use only short names in Ceph MON config 2023-09-01 15:42:23 -04:00
Joshua Boniface 91313e848e Handle bridge creation more sensibly 2023-09-01 15:42:23 -04:00
Joshua Boniface 0d9e209b45 Allow deb migrations to be installed 2023-09-01 15:42:23 -04:00
Joshua Boniface 8c15edd75c Handle creation and collection on bootstrap better 2023-09-01 15:42:23 -04:00
Joshua Boniface b4079cae88 Use new in-built database migrations in API 2023-09-01 15:42:23 -04:00
Joshua Boniface 0e5cb688dc Use new package and file names
References parallelvirtualclient/pvc#79
2023-09-01 15:42:23 -04:00
Joshua Boniface 999e50a68f Don't mess with upstream at all during bootstrap
This caused some major breakage and is not required.
2023-09-01 15:42:23 -04:00
Joshua Boniface 42d76618e3 Modify add_cluster_ips to support new bridges 2023-09-01 15:42:22 -04:00
Joshua Boniface 32b719cb4a Enable and start vhostmd service 2023-09-01 15:42:22 -04:00
Joshua Boniface bc1d9cd33b Set msgr2 mode on Ceph monitors 2023-09-01 15:42:22 -04:00
Joshua Boniface ba7270ab23 Add and remove floating IP during cluster bootstrap 2023-09-01 15:42:22 -04:00
Joshua Boniface 9546f34c34 Move netmask to separate config part 3 2023-09-01 15:42:22 -04:00
Joshua Boniface 211f83995b Ensure the Patroni ZK is clean for bootstrap 2023-09-01 15:42:22 -04:00
Joshua Boniface c27244f72d Move netmask to separate config part 2 2023-09-01 15:42:22 -04:00
Joshua Boniface e76dc2b796 Use API endpoint to bootstrap PVC cluster 2023-09-01 15:42:22 -04:00
Joshua Boniface da24aaf5ff Install Provisioner schema to database 2023-09-01 15:42:22 -04:00
Joshua Boniface f76802be6d Remove invalid flag to ceph-authtool 2023-09-01 15:42:22 -04:00
Joshua Boniface 4b488a56ea Don't become for uuidgen 2023-09-01 15:42:22 -04:00
Joshua Boniface ff68f8a2a5 Move Ceph access to storage network 2023-09-01 15:42:22 -04:00
Joshua Boniface 9448cf3d90 Add jq dependency 2023-09-01 15:42:21 -04:00
Joshua Boniface 7689e659fe Make vacuum script more comprehensive 2023-09-01 15:42:21 -04:00
Joshua Boniface 94ef3490ab Add daily Zookeeper data cleanup 2023-09-01 15:42:21 -04:00
Joshua Boniface 15a2bf1418 Add custom systemd unit for Zookeeper
We're 100% systemd here, and the lack of control/information that the
old-school ZK initscript provides is frustrating. Replace it with our
own simple unit file.
2023-09-01 15:42:21 -04:00
Joshua Boniface f98a2ee433 Add logrotate configuration 2023-09-01 15:42:21 -04:00
Joshua Boniface c0acd3e994 Add daily Postgres vacuum script 2023-09-01 15:42:21 -04:00
Joshua Boniface 823310e8a3 Limit database tasks to coordinators only
Non-coordinators don't need these configurations as they shouldn't run
there.
2023-09-01 15:42:21 -04:00
Joshua Boniface db3198aadc Bring up underlying interfaces 2023-09-01 15:42:21 -04:00
Joshua Boniface 5d3de3ece2 Complete configuration of API via Ansible 2023-09-01 15:42:21 -04:00
Joshua Boniface d5516d891c Add client API to configuration 2023-09-01 15:42:21 -04:00
Joshua Boniface cfbe724458 Install ethtool 2023-09-01 15:42:21 -04:00
Joshua Boniface f82bb6a414 Add debootstrap to package list 2023-09-01 15:42:20 -04:00
Joshua Boniface d8e9b5353f Don't try to set pool limits on libvirt key
I figured a * wildcard would work, but no, it doesn't. Libvirt needs
the ability to talk to any pool arbitrarily since PVC can create and
remove them at will.
2023-09-01 15:42:20 -04:00
Joshua Boniface 0352dd7f8f Create mgr after starting monitors 2023-09-01 15:42:20 -04:00
Joshua Boniface dbf6e52f3c Split PVC bootstrap into separate task 2023-09-01 15:42:20 -04:00
Joshua Boniface 935b4c48ae Correct bug with libvirt permissions 2023-09-01 15:42:20 -04:00
Joshua Boniface 958d2525da Handle restarting ceph-mon/mgr sequentially 2023-09-01 15:42:20 -04:00
Joshua Boniface 596ce789b1 Enable pool deletion in ceph.conf 2023-09-01 15:42:20 -04:00
Joshua Boniface e9303c1ad1 Create manager auth keyring 2023-09-01 15:42:20 -04:00
Joshua Boniface bcce7f5445 Remove per-host pvc.yml for good 2023-09-01 15:42:20 -04:00
Joshua Boniface aef72555c1 Consistent newhost format between roles 2023-09-01 15:42:20 -04:00
Joshua Boniface 9b457890d5 Use separate bootstrap files for base and pvc roles 2023-09-01 15:42:20 -04:00
Joshua Boniface 6dc57f374b Revert "Keep zookeeper enabled"
This reverts commit 5554418210.

This is not needed
2023-09-01 15:42:20 -04:00
Joshua Boniface ebcd281490 Keep zookeeper enabled
Without this, the service seems to just loop failing to start
indefinitely even though PVC attempts to start the daemon itself.
Reenabling seems to work. Likely a bug due to Zookeeper not being
a proper systemd unit.
2023-09-01 15:42:20 -04:00
Joshua Boniface a01720a09d Ensure Ceph daemons are disabled (managed by PVC) 2023-09-01 15:42:20 -04:00
Joshua Boniface a19d9c77ad Clean up some tasks during bootstrap; parallel PVC 2023-09-01 15:42:20 -04:00
Joshua Boniface 218cec1126 Start Zookeeper during install 2023-09-01 15:42:20 -04:00
Joshua Boniface e9fc24a8a8 Don't start pvc services on install 2023-09-01 15:42:20 -04:00
Joshua Boniface f823d1b351 Touch the bootstrap ceph.conf 2023-09-01 15:42:20 -04:00
Joshua Boniface aa72bb9bac Move IP removal and restart after install 2023-09-01 15:42:20 -04:00
Joshua Boniface efd8dce53d Simplify and combine 2023-09-01 15:42:20 -04:00
Joshua Boniface 256a89d7cc Reorganize some elements 2023-09-01 15:42:19 -04:00