Joshua Boniface
efeaa61e0f
Add customizable NTP servers
2023-09-01 15:42:29 -04:00
Joshua Boniface
e9f76042bd
Allow specifying alternate channels in IPMI
2023-09-01 15:42:29 -04:00
Joshua Boniface
cab4deac26
Add configuration field for plugins
2023-09-01 15:42:29 -04:00
Joshua Boniface
34d12ab423
Add Ceph check
2023-09-01 15:42:29 -04:00
Joshua Boniface
c2b576334f
Adjust plugin log config field for 0.9.62
2023-09-01 15:42:29 -04:00
Joshua Boniface
84a3f7afa0
Add edac-utils to packages
2023-09-01 15:42:29 -04:00
Joshua Boniface
612045b8b3
Restore original rsyslog-rotate script
...
Direct call doesn't work because of how arguments are passed to
postrotate.
2023-09-01 15:42:29 -04:00
Joshua Boniface
5cd9566163
Explicitly use systemctl in logrotate
...
For some reason (Debian bug?) the default rsyslog-rotate script was not
properly rotating rsyslog logfiles. Instead, explicitly call systemctl
kill -s HUP for this, using a full path.
2023-09-01 15:42:29 -04:00
Joshua Boniface
57010260bd
Use full debian_version
2023-09-01 15:42:29 -04:00
Joshua Boniface
2a925904e4
Alter format of Debian version in MOTD
2023-09-01 15:42:29 -04:00
Joshua Boniface
561ecb5c61
Adjust name of bootstrap trigger variable
...
The PVC bootstrap framework overrides this variable and wreaks havoc on
it. Instead adjust our side so that it looks for do_bootstrap instead.
2023-09-01 15:42:29 -04:00
Joshua Boniface
a79961605a
Replace per-user htoprc with system-wide config
...
Also update to newer htoprc layout from BLSE.
2023-09-01 15:42:29 -04:00
Joshua Boniface
5a48ec4d79
Ensure CPU tuning is only applied on Debian 11+
2023-09-01 15:42:29 -04:00
Joshua Boniface
07d75573d6
Add updated tuning configuration
...
Uses a much nicer CPU tuning configuration, leveraging systemd's
AllowedCPUs and CPUAffinity options within a set of slices (some
default, some custom).
Configuration is also greatly simplified versus the previous
implementation, simply asking for a number of CPUs for both the system
and OSDs, and calculating everything else that is required.
Also switches (back) to the v2 unified cgroup hierarchy by default as
required by the systemd AllowedCPUs directive.
2023-09-01 15:42:29 -04:00
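As a rough illustration of the tuning commit above (asking only for a system CPU count and an OSD CPU count, then deriving the rest), here is a minimal Python sketch. The function names, the three group names, and the contiguous low-to-high assignment are assumptions made for illustration; they are not taken from the role itself, which may order CPUs differently or account for hyperthread siblings.

```python
# Hypothetical sketch: split a node's CPUs into system, OSD, and VM groups
# from two requested counts. All names and the layout are illustrative.

def partition_cpus(total: int, system_count: int, osd_count: int) -> dict:
    """Assign the lowest CPU IDs to the system, the next to OSDs, the rest to VMs."""
    if system_count < 1 or osd_count < 0:
        raise ValueError("system_count must be >= 1 and osd_count >= 0")
    if system_count + osd_count >= total:
        raise ValueError("no CPUs would be left over for VMs")
    cpus = list(range(total))
    return {
        "system": cpus[:system_count],
        "osd": cpus[system_count:system_count + osd_count],
        "vm": cpus[system_count + osd_count:],
    }


def to_cpu_list(cpus: list) -> str:
    """Render CPU IDs as a comma-separated list, a form AllowedCPUs= accepts."""
    return ",".join(str(cpu) for cpu in cpus)


if __name__ == "__main__":
    groups = partition_cpus(total=8, system_count=2, osd_count=2)
    for name, cpus in groups.items():
        print(f"{name}: AllowedCPUs={to_cpu_list(cpus)}")
```

Each resulting list could then be written into the matching slice's configuration; which slices exist and how they are named is left to the role.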
Joshua Boniface
fa4f1cff0f
Adjust variable used for migrate selector
2023-09-01 15:42:29 -04:00
Joshua Boniface
1d35fec8a8
Remove cpuset configurations
...
This functionality simply did not work, with Libvirt continuing to dump
its processes into the root cset thus defeating the purpose entirely.
Just remove it; from some very initial testing, it isn't worth the
headache.
2023-09-01 15:42:29 -04:00
Joshua Boniface
f51fc2ce64
Fix setting of csets for OSDs
2023-09-01 15:42:29 -04:00
Joshua Boniface
8f685116b7
Add Ceph monitor backup
2023-09-01 15:42:29 -04:00
Joshua Boniface
c3ce11dacf
Fix update-motd so it runs properly
2023-09-01 15:42:29 -04:00
Joshua Boniface
267494d58a
Add lm-sensors configuration
2023-09-01 15:42:29 -04:00
Joshua Boniface
d94f587e37
Remove obsolete logrotate settings
2023-09-01 15:42:29 -04:00
Joshua Boniface
73e1f2042c
Add extra space for clarity
2023-09-01 15:42:29 -04:00
Joshua Boniface
30ddeb0fee
Update MOTD automatically on boot
...
The cron every minute was pointlessly excessive considering this doesn't
actually change minute-to-minute.
2023-09-01 15:42:29 -04:00
Joshua Boniface
86026de8ef
Adjust colour scheme of MOTD
2023-09-01 15:42:29 -04:00
Joshua Boniface
8e1d005d43
Obtain more information for MOTD header
...
Add model and serial numbers to the vendor, and put this on its own
line. Also use BASH for proper syntax formatting. Reformat the header to
be a more compact format.
2023-09-01 15:42:29 -04:00
Joshua Boniface
b987c4ea8f
Adjust GRUB_DIST and add UEFI regeneration
...
Keeps the UEFI boot list clean and consistent.
2023-09-01 15:42:29 -04:00
Joshua Boniface
144f519e76
Add rinse dependency for provisioner
2023-09-01 15:42:29 -04:00
Joshua Boniface
be091f66d4
Remove pvc-flush references
...
This service usually causes more problems than it solves, so it is being
removed in the next PVC version.
2023-09-01 15:42:28 -04:00
Joshua Boniface
08c8be66b3
Increase timeout threshold for freshness
2023-09-01 15:42:28 -04:00
Joshua Boniface
00482aec06
Fix the other instance too
2023-09-01 15:42:28 -04:00
Joshua Boniface
da98a4d445
Ignore errors about removing keys
2023-09-01 15:42:28 -04:00
Joshua Boniface
6cf8948107
Add Ceph support for single-node clusters
...
Ensures that the pool default size/min size is set to something
reasonable for a single node (effectively RAID-1) and replaces the default
CRUSH replicate_rule set for this situation with one choosing OSD
instead of host as the default.
2023-09-01 15:42:28 -04:00
Joshua Boniface
e4ccafee73
Add cgroup delegation override
...
Required to solve the occasional
libvirt: QEMU Driver error : Requested operation is not valid:
cgroup CPUACCT controller is not mounted
problem, as per:
https://answers.launchpad.net/ubuntu/+question/665132
2023-09-01 15:42:28 -04:00
Joshua Boniface
e8fe165e00
Further optimize ownership agent output
2023-09-01 15:42:28 -04:00
Joshua Boniface
cbea6e284c
Make ownership check consistent with cmk-agent 2.1
...
The new CheckMK agent uses UID 998 (dynamic) for itself. This causes
ownership problems with the old logic of this check. Move instead to a
range, where the UIDs from 200-599 are reserved for administrators, and
check for this range explicitly. Also eliminates the exceptions for ceph
and 2000 from previous iterations.
2023-09-01 15:42:28 -04:00
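The range-based ownership logic described in the commit above can be sketched in a few lines of Python. This is only an illustration: the constant and function names are hypothetical, the bounds are assumed to be inclusive, and the actual check in the repository may be written in another language entirely.

```python
# Hypothetical sketch of a range-based administrator UID check.
# The 200-599 range comes from the commit message; treating both
# ends as inclusive is an assumption.

ADMIN_UID_MIN = 200
ADMIN_UID_MAX = 599


def is_admin_uid(uid: int) -> bool:
    """Return True if the UID falls inside the reserved administrator range."""
    return ADMIN_UID_MIN <= uid <= ADMIN_UID_MAX


if __name__ == "__main__":
    # 998 is the dynamic UID the commit says the new CheckMK agent uses.
    for uid in (200, 599, 998, 2000):
        print(uid, is_admin_uid(uid))
```

Checking a range rather than listing individual UIDs is what lets the earlier per-account exceptions be dropped.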
Joshua Boniface
9e20e47903
Update freshness checks
2023-09-01 15:42:28 -04:00
Joshua Boniface
d47d320bb3
Replace freshness and kernel_version checks
...
Use an updated plugin from BLSE that uses needrestart instead of manual
parsing of these elements.
2023-09-01 15:42:28 -04:00
Joshua Boniface
ea9fe5570f
Add method to remove inactive SSH keys
2023-09-01 15:42:28 -04:00
Joshua Boniface
25dde4709b
Ensure packages are installed as newhost
2023-09-01 15:42:28 -04:00
Joshua Boniface
4dfd877c7f
Ensure Admin users are in additional groups
2023-09-01 15:42:28 -04:00
Joshua Boniface
ce9304e43e
Populate /etc/timezone as well
2023-09-01 15:42:28 -04:00
Joshua Boniface
9fe43efac2
Convert default libvirtd to template
2023-09-01 15:42:28 -04:00
Joshua Boniface
aa6b4ac3dc
Make locale generation universal
...
Don't rely on a notify/handler, just do it every time in the base role.
2023-09-01 15:42:28 -04:00
Joshua Boniface
91ca3d1510
Ensure insecure_global_id_reclaim is false
2023-09-01 15:42:28 -04:00
Joshua Boniface
3397dacab4
Fix bugs with Patroni bootstrap
2023-09-01 15:42:28 -04:00
Joshua Boniface
1838f8ff56
Add proper PostgreSQL versioning
2023-09-01 15:42:28 -04:00
Joshua Boniface
773fd5a9d4
Ensure all zkCli has -server set
2023-09-01 15:42:28 -04:00
Joshua Boniface
0e9d0b3294
Fix incorrect postgresql version
2023-09-01 15:42:28 -04:00
Joshua Boniface
35dcf979f4
Customize grub distributor
2023-09-01 15:42:28 -04:00
Joshua Boniface
ba81a106d2
Set postfix to listen on all interfaces
...
Binding to just localhost was causing problems.
2023-09-01 15:42:28 -04:00
Joshua Boniface
a87745d640
Fix name of task
2023-09-01 15:42:28 -04:00
Joshua Boniface
d6cb28b639
Add immutability to PVC subrole
...
1. Remove the obsolete pvc-vacuum script install.
2. Remove notifies when modifying configs; we do not want to restart the
daemons uncontrolled.
3. Add bootstrap check to package installs so they only happen on
bootstrap.
This ensures this part of the role, on re-runs, will *only* update
configs and not actually touch the running daemon. This makes it safe to
run before a oneshot/update-pvc-daemons.yml playbook run.
2023-09-01 15:42:28 -04:00
Joshua Boniface
77be96bf6f
Fix a few more splits
2023-09-01 15:42:28 -04:00
Joshua Boniface
95b47f8b09
Fix a few more extraneous splits
...
Just use this_node if applicable, or the raw node.hostname.
2023-09-01 15:42:28 -04:00
Joshua Boniface
87803cb7a2
Remove extraneous splits
...
The node.hostname should always be short.
2023-09-01 15:42:28 -04:00
Joshua Boniface
d24cb8a8ef
Unify and standardize inventory_hostname
...
This was causing some confusing conflicts, so create a new fact called
"this_node" which is inventory_hostname.split('.')[0], i.e. the short
name, and use that everywhere instead of an FQDN or true inventory
hostname.
2023-09-01 15:42:28 -04:00
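The "this_node" expression quoted in the commit above is ordinary Python string splitting, so its behaviour can be shown directly. The function name and the example hostnames below are made up for illustration.

```python
# Sketch of the short-name derivation: inventory_hostname.split('.')[0].
# The hostnames used here are hypothetical examples.

def short_hostname(inventory_hostname: str) -> str:
    """Return everything before the first dot, i.e. the short node name."""
    return inventory_hostname.split(".")[0]


if __name__ == "__main__":
    print(short_hostname("hv1.example.tld"))
    print(short_hostname("hv1"))
```

A name that is already short passes through unchanged, which is why the same expression works for both FQDN and short inventory entries.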
Joshua Boniface
056c325486
Add option for setting CPU governor
...
Allows the administrator to set a CPU frequency governor if they need
to, though the default of ondemand is usually sufficient.
2023-09-01 15:42:28 -04:00
Joshua Boniface
fc5bcf139c
Fix name of IPMI check again
2023-09-01 15:42:28 -04:00
Joshua Boniface
44cedf66c9
Fix name of ipmi check
2023-09-01 15:42:28 -04:00
Joshua Boniface
9f7dbfb4f8
Add IPMI check to tasks
2023-09-01 15:42:28 -04:00
Joshua Boniface
b9ae4d1009
Adjust headers and add LOM check
2023-09-01 15:42:27 -04:00
Joshua Boniface
48fb21af75
Add node list to PVC MOTD
2023-09-01 15:42:27 -04:00
Joshua Boniface
e009cf4076
Fix whitespaced manufacturer and bad [[
2023-09-01 15:42:27 -04:00
Joshua Boniface
e65f1d15a6
Add coordinator state to MOTD
2023-09-01 15:42:27 -04:00
Joshua Boniface
894ce9b517
Support unknown manufacturers in MOTD
2023-09-01 15:42:27 -04:00
Joshua Boniface
55ec177919
Ignore errors restarting libvirtd
...
This seems to inexplicably fail sometimes. We can just ignore it.
2023-09-01 15:42:27 -04:00
Joshua Boniface
b814ec60f6
Add resolv.conf customization
2023-09-01 15:42:27 -04:00
Joshua Boniface
ddecb94348
Disable unified cgroup hierarchy on kernel cmdline
...
This is required on Debian 11 to use the cset tool, since the newer
systemd implementation of a unified cgroup hierarchy is not compatible
with the cset tool.
Ref for future use:
https://github.com/lpechacek/cpuset/issues/40
2023-09-01 15:42:27 -04:00
Joshua Boniface
be3ce67574
Use inventory_hostname in IPMI fragment
2023-09-01 15:42:27 -04:00
Joshua Boniface
5f05835721
Update bondX configuration
2023-09-01 15:42:27 -04:00
Joshua Boniface
4cb2d7835c
Add setting bridge_mtu to config
2023-09-01 15:42:27 -04:00
Joshua Boniface
9f16995f59
Add smartmontools to base package list
2023-09-01 15:42:27 -04:00
Joshua Boniface
6e2d661134
Adjust documentation and behaviour of cpuset
...
1. Detail the caveats and specific situations and ref the documentation
which will provide more details.
2. Always install the configs, but use /etc/default/ceph-osd-cpuset to
control if the script does anything or not (so, the "osd" cset set is
always active, just not set in a special way).
2023-09-01 15:42:27 -04:00
Joshua Boniface
83bd1b1efd
Install cset configs even if disabled
...
The setup script handles this instead.
2023-09-01 15:42:27 -04:00
Joshua Boniface
7927ec4f11
Allow dynamic enabling/disabling of cset
...
Add a separate config to handle enable/disable on the system itself.
2023-09-01 15:42:27 -04:00
Joshua Boniface
2ae9b9075a
Adjust default ceph.conf parameters
...
1. Remove an explicit OSD journal size, especially such a small one (no
clue why I ever added that...)
2. Add max scrubs, disable scrub during recovery, and set scrub sleep.
3. Add max backfills, tune recovery sleep to 0 to prioritize recovery.
2023-09-01 15:42:27 -04:00
Joshua Boniface
6e48d6fe84
Add Ceph OSD cpuset tuning options
...
Allows an administrator to set CPU pinning with the cpuset tool for Ceph
OSDs, in situations where CPU contention with VMs or other system tasks
may be negatively affecting OSD performance. This is optional, advanced
tuning and is disabled by default.
2023-09-01 15:42:27 -04:00
Joshua Boniface
45424a28ce
Fix bad flag
2023-09-01 15:42:27 -04:00
Joshua Boniface
044a14fa6d
Add package installs for different Debian versions
2023-09-01 15:42:27 -04:00
Joshua Boniface
ae40227ea1
Move paths and keys to defaults
2023-09-01 15:42:27 -04:00
Joshua Boniface
f25a80ff53
Add additional CMK checks
2023-09-01 15:42:26 -04:00
Joshua Boniface
8c2d117a3c
Wait longer when restarting services
...
From 15 -> 30 seconds to ensure more time for stabilization before
proceeding with the next.
2023-09-01 15:42:26 -04:00
Joshua Boniface
647ca1c446
Add default features flag to ceph.conf generator
...
Coupled with the removal of explicit --image-features flags to the RBD
command in PVC itself, this ensures that only the two features supported
on kernel 4.19 are enabled by default.
2023-09-01 15:42:26 -04:00
Joshua Boniface
86eaeed2b4
Fix sources.list for Bullseye
2023-09-01 15:42:26 -04:00
Joshua Boniface
3d64ad2420
Typo fix
2023-09-01 15:42:26 -04:00
Joshua Boniface
eaea860b61
Lower autopurge interval to 1 hour
2023-09-01 15:42:26 -04:00
Joshua Boniface
524f857f56
Add some Zookeeper configuration tweaks
2023-09-01 15:42:26 -04:00
Joshua Boniface
13556918d7
Disable any systemd start rate limiting
...
Because Zookeeper is supremely stupid (see last commit) we want to
disable start limiting. It needs to keep trying forever until it starts.
2023-09-01 15:42:26 -04:00
Joshua Boniface
8eecc95f2f
Ensure Zookeeper restarts itself
...
The Zookeeper daemon does not appear to exit with any status other than
0, even after a fatal error. Work around this.
2023-09-01 15:42:26 -04:00
Joshua Boniface
b03ecf0125
Add -XX:+AlwaysPreTouch option for Zookeeper
2023-09-01 15:42:26 -04:00
Joshua Boniface
b842276002
Lower keep count for Zookeeper vacuum to 3
...
Required to keep disk space growth down when using zookeeper_logging
functionality.
2023-09-01 15:42:26 -04:00
Joshua Boniface
681afd1d1b
Fix excessive whitespace
2023-09-01 15:42:26 -04:00
Joshua Boniface
2d31e6c8ea
Fix memory tuning issues
2023-09-01 15:42:26 -04:00
Joshua Boniface
71b6da6555
Adjust package lists per Debian version
2023-09-01 15:42:26 -04:00
Joshua Boniface
4b0a4ae73c
Fix bad Ansible variable name
2023-09-01 15:42:26 -04:00
Joshua Boniface
a52d4cbf37
Add Zookeeper logging configs
2023-09-01 15:42:26 -04:00
Joshua Boniface
7bacbd5dd6
Don't fail if IPMI tasks fail
2023-09-01 15:42:26 -04:00
Joshua Boniface
eef0f959dd
Add GRUB, Plymouth themes and issue for PVC
2023-09-01 15:42:26 -04:00
Joshua Boniface
6d3e5ac728
Fix zkcli for good
2023-09-01 15:42:26 -04:00
Joshua Boniface
e760114b8d
Fix bootstrap collection path for Ceph
2023-09-01 15:42:26 -04:00
Joshua Boniface
bace67b8bf
Add GRUB configuration to Ansible role
2023-09-01 15:42:26 -04:00
Joshua Boniface
0802cca980
Support both versions of psycopg2 and kazoo
2023-09-01 15:42:26 -04:00
Joshua Boniface
31a677b444
Fix Patroni ACL to use subnet mask
2023-09-01 15:42:26 -04:00
Joshua Boniface
35089f6dda
Fix zkcli alias to use hostname
2023-09-01 15:42:26 -04:00
Joshua Boniface
9dc9139c35
Use short ansible_hostname in ipmi fragment
2023-09-01 15:42:26 -04:00
Joshua Boniface
329bc9690e
Add ipmitool to packages list
2023-09-01 15:42:26 -04:00
Joshua Boniface
a2ed38b459
Add generic SR-IOV configuration
2023-09-01 15:42:26 -04:00
Joshua Boniface
0fc889df32
Ensure we can connect to Patroni
2023-09-01 15:42:26 -04:00
Joshua Boniface
388db6ad1d
Use IPs for Patroni configuration
2023-09-01 15:42:26 -04:00
Joshua Boniface
d455b31905
Bump max connections in Zookeeper to 200
2023-09-01 15:42:26 -04:00
Joshua Boniface
f105f0497c
Configure Zookeeper only on Cluster address
2023-09-01 15:42:26 -04:00
Joshua Boniface
7e94dddb4c
Ensure libvirtd restarts when unit changes
2023-09-01 15:42:26 -04:00
Joshua Boniface
c9df64bc7d
Ensure deb-src is present for bullseye
2023-09-01 15:42:26 -04:00
Joshua Boniface
0bbb91fc8b
Add override custom libvirtd.service unit
...
This has no functional change on Buster, but on Bullseye this overrides
the stupid socket-based activation shenanigans that the default unit
tries to do, as well as the breaking replacement of the
/etc/default/libvirt variable names.
2023-09-01 15:42:26 -04:00
Joshua Boniface
3a67dc129b
Ensure DEBIAN_FRONTEND is noninteractive
2023-09-01 15:42:26 -04:00
Joshua Boniface
0114ad8ed5
Add python3 version of psycopg2 explicitly
2023-09-01 15:42:26 -04:00
Joshua Boniface
a548bdcc6a
Use inventory_hostname for IPMI dict
2023-09-01 15:42:26 -04:00
Joshua Boniface
6104e0a5a5
Use independent fact to work around codename
2023-09-01 15:42:26 -04:00
Joshua Boniface
5c46bb0db7
Ensure backup_keys isn't empty
2023-09-01 15:42:25 -04:00
Joshua Boniface
d69770b776
Avoid writing hosts if empty
2023-09-01 15:42:25 -04:00
Joshua Boniface
f4e49b9d3e
Ensure apt-update runs if configs update
2023-09-01 15:42:25 -04:00
Joshua Boniface
9438ab46d7
Add bullseye support
2023-09-01 15:42:25 -04:00
Joshua Boniface
dc83f91bd8
Add directory creation to backup script
2023-09-01 15:42:25 -04:00
Joshua Boniface
5466df7065
Add PostgreSQL to daily backup script
2023-09-01 15:42:25 -04:00
Joshua Boniface
c9742fe2e5
Update tags and fix backup keys to var
2023-09-01 15:42:25 -04:00
Joshua Boniface
7c7ca4a229
Allow inter-cluster orphan NTP sync
...
Due to the requirement of Ceph to have all peer nodes tightly
synchronized with each other to come online, PVC nodes need a way to
synchronize to each other even in the absence of an external time
reference. This is especially prevalent if a set of nodes are left
offline for an extended period (>1-2 weeks), since their hardware clocks
will drift. If the resulting Internet connectivity is then dependent on
a VM, this will cause a catch-22 and the cluster will not properly
start.
This configuration will accomplish that - if no suitable >6 stratum
peers are found, the hosts will enter orphan mode. Since they are now
all configured as "peers" with each other, they will collectively decide
on one of them to become the source and sync to it. A local stratum 10
fudge is added so that at least one of the nodes can become this source.
While this is not an ideal use of NTP, it is by far the cleanest
solution to this problem, and does not impact normal functionality when
the two configured stratum-2 servers are reachable.
2023-09-01 15:42:25 -04:00
Joshua Boniface
027a819a83
Move some other tasks to bootstrap role
...
Avoids an issue where the pvcnoded service is stopped on non-bootstrap
runs.
2023-09-01 15:42:25 -04:00
Joshua Boniface
e53342474c
Remove GRUB config from base role
...
This is not actually ideal.
2023-09-01 15:42:25 -04:00
Joshua Boniface
4666db17cb
Fix version sorting bugs in kernel-cleanup.sh
2023-09-01 15:42:25 -04:00
Joshua Boniface
6903627150
Add additional items to base role
...
Backups, GRUB configuration, and IPMI configuration.
2023-09-01 15:42:25 -04:00
Joshua Boniface
c96ad603b0
Fix sudoers to use conditional deploy_username
2023-09-01 15:42:25 -04:00
Joshua Boniface
29363ebf80
Allow configurable fail2ban IPs
2023-09-01 15:42:25 -04:00
Joshua Boniface
d9be39a048
Allow customization of deploy username
2023-09-01 15:42:25 -04:00
Joshua Boniface
4dc5ebdba0
Move to more dynamic apt configs
...
Allow specifying repository URLs in the group_vars, and add
release-specific template files to support future version changes.
2023-09-01 15:42:25 -04:00
Joshua Boniface
6a61f8f7bf
Update relative path to bootstrap files
2023-09-01 15:42:25 -04:00
Joshua Boniface
4caab67d03
Remove superfluous symlink
2023-09-01 15:42:25 -04:00
Joshua Boniface
57e5953fd1
Add sensible sorting of kernel removals
2023-09-01 15:42:25 -04:00
Joshua Boniface
2a72a826f5
Remove cruft and add mkpasswd setup
2023-09-01 15:42:25 -04:00
Joshua Boniface
bf02da693f
Correct bad indentation in base role
2023-09-01 15:42:25 -04:00
Joshua Boniface
39b8229c35
Add libguestfs-tools to libvirt role deps
2023-09-01 15:42:25 -04:00
Joshua Boniface
1f6cb077fa
Update tags and add kernel-cleanup script
2023-09-01 15:42:25 -04:00
Joshua Boniface
0bf9c6209c
Fix incorrect systemd enabling in Patroni
2023-09-01 15:42:25 -04:00
Joshua Boniface
c0dc6fad4e
Add some additional compression libraries
2023-09-01 15:42:25 -04:00
Joshua Boniface
a4be011884
Add local domain to resolver config
2023-09-01 15:42:25 -04:00
Joshua Boniface
4f5dbee8ee
Correct bugs during bootstrap
...
1. Ensure Zookeeper restarts and checks out successfully before
proceeding with other steps.
2. Make sure PVC itself doesn't start prematurely.
2023-09-01 15:42:25 -04:00
Joshua Boniface
26dbd082ef
Retry pgsql bootstrap startup 6 times
...
This will sometimes fail, so retry it several times
2023-09-01 15:42:25 -04:00
Joshua Boniface
e9f08ad100
Retry msgr2 enabling 6 times
...
This will sometimes fail, so retry it several times
2023-09-01 15:42:25 -04:00
Joshua Boniface
a77e41bf7c
Remove invalid timezone entries in postgres conf
2023-09-01 15:42:25 -04:00
Joshua Boniface
cba276e248
Add default values
2023-09-01 15:42:24 -04:00
Joshua Boniface
be94bc134f
Add configurable ZK memory limits
2023-09-01 15:42:24 -04:00