Joshua Boniface
e4e084cc5b
Fix name of task
2021-11-15 14:46:44 -05:00
Joshua Boniface
bea79b5102
Add immutability to PVC subrole
...
1. Remove the obsolete pvc-vacuum script install.
2. Remove notifies when modifying configs; we do not want uncontrolled
daemon restarts.
3. Add bootstrap check to package installs so they only happen on
bootstrap.
This ensures this part of the role, on re-runs, will *only* update
configs and not actually touch the running daemon. This makes it safe to
run before a oneshot/update-pvc-daemons.yml playbook run.
2021-11-15 10:51:38 -05:00
Joshua Boniface
bb3b7e3922
Fix a few more splits
2021-11-11 17:37:27 -05:00
Joshua Boniface
414678f683
Fix a few more extraneous splits
...
Just use this_node if applicable, or the raw node.hostname.
2021-11-11 17:35:42 -05:00
Joshua Boniface
b24e539252
Remove extraneous splits
...
The node.hostname should always be short.
2021-11-11 17:31:56 -05:00
Joshua Boniface
243c910d6d
Unify and standardize inventory_hostname
...
This was causing some confusing conflicts, so create a new fact called
"this_node" which is inventory_hostname.split('.')[0], i.e. the short
name, and use that everywhere instead of an FQDN or true inventory
hostname.
2021-11-11 17:19:03 -05:00
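As a minimal sketch of the "this_node" fact described in 243c910d6d, the same expression can be tried in plain Python. The function name and the example hostnames are hypothetical; only the `split('.')[0]` expression comes from the commit message.

```python
def short_node_name(inventory_hostname: str) -> str:
    """Return the short name, i.e. everything before the first dot."""
    # Same expression as the commit message: inventory_hostname.split('.')[0]
    return inventory_hostname.split(".")[0]


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    print(short_node_name("hv1.example.tld"))
    print(short_node_name("hv1"))
```

A name with no dots is returned unchanged, which is why using the short name everywhere is safe whether the inventory lists FQDNs or short names.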
Joshua Boniface
fed71d7add
Add option for setting CPU governor
...
Allows the administrator to set a CPU frequency governor if they need
to, though the default of ondemand is usually sufficient.
2021-11-08 00:21:58 -05:00
Joshua Boniface
dd60b6b9ea
Fix name of IPMI check again
2021-11-02 22:21:16 -04:00
Joshua Boniface
99682c16a2
Fix name of ipmi check
2021-11-02 22:16:47 -04:00
Joshua Boniface
319ca891d5
Add IPMI check to tasks
2021-11-02 22:04:51 -04:00
Joshua Boniface
b7bca571a8
Adjust headers and add LOM check
2021-11-02 22:04:27 -04:00
Joshua Boniface
bd98fdfbd8
Add node list to PVC MOTD
2021-11-02 22:04:27 -04:00
Joshua Boniface
079013dfbc
Fix whitespaced manufacturer and bad [[
2021-10-11 15:08:04 -04:00
Joshua Boniface
8c3b5d7dab
Add coordinator state to MOTD
2021-10-11 15:05:01 -04:00
Joshua Boniface
cb6199ef0d
Support unknown manufacturers in MOTD
2021-10-11 14:59:55 -04:00
Joshua Boniface
34a016bdac
Ignore errors restarting libvirtd
...
This seems to inexplicably fail sometimes. We can just ignore it.
2021-10-11 14:47:04 -04:00
Joshua Boniface
739c60fce0
Add resolv.conf customization
2021-10-11 14:41:29 -04:00
Joshua Boniface
3de777a036
Disable unified cgroup hierarchy on kernel cmdline
...
This is required on Debian 11 to use the cset tool, since the newer
systemd implementation of a unified cgroup hierarchy is not compatible
with the cset tool.
Ref for future use:
https://github.com/lpechacek/cpuset/issues/40
2021-10-10 03:44:13 -04:00
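The kernel command line change in 3de777a036 could be sketched as below. This is an illustration only: it assumes the flag involved is systemd's `systemd.unified_cgroup_hierarchy=0` boot option, which the commit message does not spell out, and the helper name is hypothetical. The role itself presumably does this through its own configuration templates rather than through Python.

```python
# Assumption: this is the boot option used to fall back to the legacy
# (non-unified) cgroup hierarchy; the commit message does not name it.
CGROUP_V1_FLAG = "systemd.unified_cgroup_hierarchy=0"


def ensure_cmdline_flag(cmdline: str, flag: str = CGROUP_V1_FLAG) -> str:
    """Return cmdline with flag present exactly once.

    Any existing argument with the same key (the text before '=') is
    dropped first, so repeated runs do not accumulate duplicates.
    """
    key = flag.split("=", 1)[0]
    kept = [arg for arg in cmdline.split() if arg.split("=", 1)[0] != key]
    return " ".join(kept + [flag])


if __name__ == "__main__":
    print(ensure_cmdline_flag("quiet console=tty0"))
```

Dropping any earlier value of the same key keeps the operation idempotent, which matters for a role that is re-run against live nodes.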
Joshua Boniface
f0f3960250
Use inventory_hostname in IPMI fragment
2021-10-10 02:57:54 -04:00
Joshua Boniface
5ab40fa15f
Update bondX configuration
2021-10-10 02:31:47 -04:00
Joshua Boniface
2c0e09f657
Add setting bridge_mtu to config
2021-10-09 19:29:22 -04:00
Joshua Boniface
859cfbb51e
Add smartmontools to base package list
2021-10-07 15:18:45 -04:00
Joshua Boniface
5797535997
Adjust documentation and behaviour of cpuset
...
1. Detail the caveats and specific situations, and reference the
documentation, which provides more details.
2. Always install the configs, but use /etc/default/ceph-osd-cpuset to
control whether the script does anything (so the "osd" cset set is
always active, just not set in a special way).
2021-09-29 20:49:00 -04:00
Joshua Boniface
81cf341c32
Install cset configs even if disabled
...
The setup script handles this instead.
2021-09-29 10:23:01 -04:00
Joshua Boniface
645249b57e
Allow dynamic enabling/disabling of cset
...
Add a separate config to handle enable/disable on the system itself.
2021-09-29 10:21:47 -04:00
Joshua Boniface
8ac2a5ea0c
Adjust default ceph.conf parameters
...
1. Remove an explicit OSD journal size, especially such a small one (no
clue why I ever added that...)
2. Add max scrubs, disable scrub during recovery, and set scrub sleep.
3. Add max backfills, tune recovery sleep to 0 to prioritize recovery.
2021-09-28 02:09:50 -04:00
Joshua Boniface
732bfe732c
Add Ceph OSD cpuset tuning options
...
Allows an administrator to set CPU pinning with the cpuset tool for Ceph
OSDs, in situations where CPU contention with VMs or other system tasks
may be negatively affecting OSD performance. This is optional, advanced
tuning and is disabled by default.
2021-09-27 00:27:57 -04:00
Joshua Boniface
d7b07925bb
Fix bad flag
2021-09-09 13:07:15 -04:00
Joshua Boniface
77c84cec52
Add package installs for different Debian versions
2021-09-09 12:59:18 -04:00
Joshua Boniface
a91112fa71
Move paths and keys to defaults
2021-08-24 15:25:42 -04:00
Joshua Boniface
2e9d02ab52
Add additional CMK checks
2021-08-21 15:41:44 -04:00
Joshua Boniface
b37d6c3009
Wait longer when restarting services
...
From 15 -> 30 seconds to ensure more time for stabilization before
proceeding with the next.
2021-07-30 11:46:49 -04:00
Joshua Boniface
b62731199f
Add default features flag to ceph.conf generator
...
Coupled with the removal of explicit --image-features flags to the RBD
command in PVC itself, this ensures that only the two features supported
on kernel 4.19 are enabled by default.
2021-07-30 11:39:24 -04:00
Joshua Boniface
2cc4548af6
Fix sources.list for Bullseye
2021-07-26 00:36:39 -04:00
Joshua Boniface
dd2fe47881
Typo fix
2021-07-20 13:59:47 -04:00
Joshua Boniface
9e42e6ae88
Lower autopurge interval to 1 hour
2021-07-20 13:57:59 -04:00
Joshua Boniface
13dd41bb3e
Add some Zookeeper configuration tweaks
2021-07-19 16:31:40 -04:00
Joshua Boniface
f294817b55
Disable any systemd start rate limiting
...
Because Zookeeper is supremely stupid (see last commit) we want to
disable start limiting. It needs to keep trying forever until it starts.
2021-07-19 13:21:16 -04:00
Joshua Boniface
b112663ef0
Ensure Zookeeper restarts itself
...
The Zookeeper daemon does not appear to exit with any status other than
0, even after a fatal error. Work around this.
2021-07-19 13:03:09 -04:00
Joshua Boniface
bd4d94568e
Add -XX:+AlwaysPreTouch option for Zookeeper
2021-07-19 12:46:21 -04:00
Joshua Boniface
e232ab00da
Lower keep count for Zookeeper vacuum to 3
...
Required to keep disk space growth down when using zookeeper_logging
functionality.
2021-07-19 09:51:07 -04:00
Joshua Boniface
3adacf3107
Fix excessive whitespace
2021-07-18 22:13:09 -04:00
Joshua Boniface
764c2c3928
Fix memory tuning issues
2021-07-18 18:51:21 -04:00
Joshua Boniface
10a1754285
Adjust package lists per Debian version
2021-07-18 18:36:58 -04:00
Joshua Boniface
b33096202e
Fix bad Ansible variable name
2021-07-18 17:49:42 -04:00
Joshua Boniface
0e046b48d4
Add Zookeeper logging configs
2021-07-18 17:47:02 -04:00
Joshua Boniface
a1362c4363
Don't fail if IPMI tasks fail
2021-07-07 10:42:30 -04:00
Joshua Boniface
96544aabb8
Add GRUB, Plymouth themes and issue for PVC
2021-06-30 02:50:18 -04:00
Joshua Boniface
9d4eb89bde
Fix zkcli for good
2021-06-29 18:16:02 -04:00
Joshua Boniface
c0ad9740f4
Fix bootstrap collection path for Ceph
2021-06-29 17:52:21 -04:00
Joshua Boniface
3d47b12b76
Add GRUB configuration to Ansible role
2021-06-29 17:48:55 -04:00
Joshua Boniface
120871ee45
Support both versions of psycopg2 and kazoo
2021-06-29 17:29:01 -04:00
Joshua Boniface
231cb7b2aa
Fix Patroni ACL to use subnet mask
2021-06-29 16:47:55 -04:00
Joshua Boniface
d794197633
Fix zkcli alias to use hostname
2021-06-29 16:47:42 -04:00
Joshua Boniface
9855088a8e
Use short ansible_hostname in ipmi fragment
2021-06-29 15:38:19 -04:00
Joshua Boniface
10e8947cb0
Add ipmitool to packages list
2021-06-29 15:30:54 -04:00
Joshua Boniface
53872c0056
Add generic SR-IOV configuration
2021-06-22 03:47:03 -04:00
Joshua Boniface
d88ba7272d
Ensure we can connect to Patroni
2021-06-22 03:28:36 -04:00
Joshua Boniface
84bf1d7efa
Use IPs for Patroni configuration
2021-06-22 03:27:01 -04:00
Joshua Boniface
ae45da3f85
Bump max connections in Zookeeper to 200
2021-06-22 03:15:23 -04:00
Joshua Boniface
c6590f8ab9
Configure Zookeeper only on Cluster address
2021-06-22 03:15:23 -04:00
Joshua Boniface
6396eaa5ff
Ensure libvirtd restarts when unit changes
2021-06-22 03:15:23 -04:00
Joshua Boniface
73bc005c0b
Ensure deb-src is present for bullseye
2021-06-22 03:15:23 -04:00
Joshua Boniface
ec879f4e3c
Add override custom libvirtd.service unit
...
This has no functional change on Buster, but on Bullseye this overrides
the stupid socket-based activation shenanigans that the default unit
tries to do, as well as the breaking replacement of the
/etc/default/libvirt variable names.
2021-06-22 03:15:23 -04:00
Joshua Boniface
b4e9ed5d39
Ensure DEBIAN_FRONTEND is noninteractive
2021-06-22 03:15:23 -04:00
Joshua Boniface
4ccc23bd85
Add python3 version of psycopg2 explicitly
2021-06-22 03:15:23 -04:00
Joshua Boniface
8a140f70dc
Use inventory_hostname for IPMI dict
2021-06-22 03:15:23 -04:00
Joshua Boniface
836c946c72
Use independent fact to work around codename
2021-06-07 10:54:55 -04:00
Joshua Boniface
69c037c136
Ensure backup_keys isn't empty
2021-06-06 00:41:53 -04:00
Joshua Boniface
6b79e5db31
Avoid writing hosts if empty
2021-06-05 01:12:00 -04:00
Joshua Boniface
8fa8590eb8
Ensure apt-update runs if configs update
2021-06-05 01:03:35 -04:00
Joshua Boniface
9dc0949b47
Add bullseye support
2021-06-05 00:56:02 -04:00
Joshua Boniface
998e5a8752
Add directory creation to backup script
2021-06-01 10:16:08 -04:00
Joshua Boniface
0aa328e350
Add PostgreSQL to daily backup script
2021-06-01 10:10:22 -04:00
Joshua Boniface
9deee94332
Update tags and fix backup keys to var
2021-05-27 12:29:19 -04:00
Joshua Boniface
e76832de91
Allow intra-cluster orphan NTP sync
...
Due to the requirement of Ceph to have all peer nodes tightly
synchronized with each other to come online, PVC nodes need a way to
synchronize to each other even in the absence of an external time
reference. This is especially likely if a set of nodes is left offline
for an extended period (>1-2 weeks), since their hardware clocks will
drift. If Internet connectivity is then dependent on a VM, this causes
a catch-22 and the cluster will not start properly.
This configuration will accomplish that - if no suitable >6 stratum
peers are found, the hosts will enter orphan mode. Since they are now
all configured as "peers" with each other, they will collectively decide
on one of them to become the source and sync to it. A local stratum 10
fudge is added so that at least one of the nodes can become this source.
While this is not an ideal use of NTP, it is by far the cleanest
solution to this problem, and does not impact normal functionality when
the two configured stratum-2 servers are reachable.
2021-05-19 11:03:18 -04:00
Joshua Boniface
238449904f
Move some other tasks to bootstrap role
...
Avoids an issue where the pvcnoded service is stopped on non-bootstrap
runs.
2021-05-13 10:17:38 -04:00
Joshua Boniface
7536732f30
Remove GRUB config from base role
...
This is not actually ideal.
2021-05-12 14:55:57 -04:00
Joshua Boniface
04bc9730a0
Fix version sorting bugs in kernel-cleanup.sh
2021-05-12 14:40:18 -04:00
Joshua Boniface
45322e0f9e
Add additional items to base role
...
Backups, GRUB configuration, and IPMI configuration.
2021-05-12 13:53:15 -04:00
Joshua Boniface
da9eafcdfa
Fix sudoers to use conditional deploy_username
2021-04-13 16:50:05 -04:00
Joshua Boniface
70ba4b240f
Allow configurable fail2ban IPs
2021-04-13 16:44:49 -04:00
Joshua Boniface
ce3554b530
Allow customization of deploy username
2021-04-13 11:30:42 -04:00
Joshua Boniface
3819cd87fd
Move to more dynamic apt configs
...
Allow specifying repository URLs in the group_vars, and add
release-specific template files to support future version changes.
2021-04-08 14:14:25 -04:00
Joshua Boniface
404751f695
Update relative path to bootstrap files
2021-04-08 14:04:56 -04:00
Joshua Boniface
622cef1586
Remove superfluous symlink
2021-04-08 13:50:47 -04:00
Joshua Boniface
6589a9cd38
Add sensible sorting of kernel removals
2021-04-08 13:46:43 -04:00
Joshua Boniface
6598637e91
Remove cruft and add mkpasswd setup
2021-04-08 13:46:30 -04:00
Joshua Boniface
dcd0b48d94
Correct bad indentation in base role
2021-03-18 09:36:49 -04:00
Joshua Boniface
82fa85834a
Add libguestfs-tools to libvirt role deps
2021-03-15 13:39:37 -04:00
Joshua Boniface
ca3a5e144f
Update tags and add kernel-cleanup script
2021-02-02 15:41:38 -05:00
Joshua Boniface
1c05c8729f
Fix incorrect systemd enabling in Patroni
2021-01-28 16:28:02 -05:00
Joshua Boniface
f4974d648d
Add some additional compression libraries
2021-01-28 13:34:58 -05:00
Joshua Boniface
fa0aeec88e
Add local domain to resolver config
2021-01-28 13:34:26 -05:00
Joshua Boniface
04ca8f73d2
Correct bugs during bootstrap
...
1. Ensure Zookeeper restarts and checks out successfully before
proceeding with other steps.
2. Make sure PVC itself doesn't start prematurely.
2021-01-28 13:32:36 -05:00
Joshua Boniface
b7f251ea16
Retry pgsql bootstrap startup 6 times
...
This will sometimes fail, so retry it several times.
2021-01-27 15:45:36 -05:00
Joshua Boniface
7b08610efa
Retry msgr2 enabling 6 times
...
This will sometimes fail, so retry it several times.
2021-01-27 14:13:09 -05:00
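The retry behaviour added in the two commits above (b7f251ea16 and 7b08610efa) can be sketched in Python as follows. This only illustrates the idea of retrying a flaky step up to 6 times; the role itself is Ansible, and the function name, the delay value, and the `task` callable are hypothetical.

```python
import time


def retry(task, attempts=6, delay=5.0):
    """Call task() until it succeeds, at most `attempts` times.

    The exception from the final attempt is re-raised so the caller
    still sees a failure if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == attempts:
                raise
            # Wait before the next attempt; the delay value is an assumption.
            time.sleep(delay)
```

Bounding the number of attempts keeps a genuinely broken step from hanging the run forever while still absorbing the occasional transient failure the commits describe.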
Joshua Boniface
c4c285c7b3
Remove invalid timezone entries in postgres conf
2021-01-26 15:20:25 -05:00
Joshua Boniface
7585553225
Add default values
2020-12-21 00:20:45 -05:00
Joshua Boniface
ac071f4bf0
Add configurable ZK memory limits
2020-12-21 00:20:45 -05:00