Joshua Boniface
e3c1d28674
Add upgrade to Debian 12 playbook
2023-09-01 15:42:29 -04:00
Joshua Boniface
7e829f04ae
Restore unknown state as not-reboot
2023-09-01 15:42:29 -04:00
Joshua Boniface
2c63500011
Split upgrade stage and add dpkg cleanup
...
Avoid problems if one or more nodes are upgrading libvirt/QEMU and live
migrations fail.
2023-09-01 15:42:29 -04:00
Joshua Boniface
7a0c596281
Add node daemon confirmation before continue
2023-09-01 15:42:29 -04:00
Joshua Boniface
3d4e66471e
Trigger restart even with rc=3
2023-09-01 15:42:29 -04:00
Joshua Boniface
cbae685b45
Ignore needrestart unknown case
2023-09-01 15:42:29 -04:00
Joshua Boniface
7cac7b26ce
Ensure freshness check is proper
2023-09-01 15:42:28 -04:00
Joshua Boniface
be091f66d4
Remove pvc-flush references
...
This service usually causes more problems than it solves, so it is being
removed in the next PVC version.
2023-09-01 15:42:28 -04:00
Joshua Boniface
5aeca53212
Add PostgreSQL cleanup to upgrade
2023-09-01 15:42:28 -04:00
Joshua Boniface
d8cad85e91
Add oneshot playbook to reboot cluster
2023-09-01 15:42:28 -04:00
Joshua Boniface
9e20e47903
Update freshness checks
2023-09-01 15:42:28 -04:00
Joshua Boniface
5de3ab0c3a
Move pvc maintenance to separate plays
...
This ensures that maintenance mode is turned on before all tasks and
turned off after all tasks, rather than intermittently.
2023-09-01 15:42:28 -04:00
Joshua Boniface
94e9bf9133
Ignore errors during flush commands
...
These might inexplicably fail, but that is fine.
2023-09-01 15:42:27 -04:00
Joshua Boniface
197000a48d
Revert "Add symlink for Ceph file pickup"
...
This reverts commit 3ac946bf2e.
2023-09-01 15:42:27 -04:00
Joshua Boniface
94c019f960
Add symlink for Ceph file pickup
2023-09-01 15:42:27 -04:00
Joshua Boniface
3161708593
Include another upgrade in deb11 playbook
...
Ensures that the system is fully updated after re-enabling the security
repository during the base run.
2023-09-01 15:42:27 -04:00
Joshua Boniface
593599c148
Add Debian 10 -> Debian 11 upgrade playbook
2023-09-01 15:42:27 -04:00
Joshua Boniface
ec2fd99eb6
Avoid errors if noout fails
2023-09-01 15:42:27 -04:00
Joshua Boniface
b9f00e3faf
Increase flush/unflush wait timeout
...
Bump this from 10 minutes (60 * 10 seconds) to 30 minutes (180 * 10
seconds) to ensure there is sufficient time for (relatively) large VMs
to migrate with (relatively) slow networking.
2023-09-01 15:42:26 -04:00
Joshua Boniface
4fe6204dfb
Use wait on secondary and delay for 15 seconds
2023-09-01 15:42:26 -04:00
Joshua Boniface
43d4f69608
Rename Daemon upgrade playbook to match
2023-09-01 15:42:26 -04:00
Joshua Boniface
e55f465034
Reduce timeouts in upgrade playbook
2023-09-01 15:42:26 -04:00
Joshua Boniface
822e39b325
Fix name to be more clear
2023-09-01 15:42:26 -04:00
Joshua Boniface
2d9a5a9d31
Adjust ordering of flush task
2023-09-01 15:42:26 -04:00
Joshua Boniface
1cfbc25f37
Add norestart policy for apt updates
2023-09-01 15:42:25 -04:00
Joshua Boniface
ccc6489512
Add README and daemon upgrade playbook, cleanups
2023-09-01 15:42:25 -04:00
Joshua Boniface
f34b2a5f7e
Add cleanup to update oneshot playbook
2023-09-01 15:42:25 -04:00
Joshua Boniface
0ddd11844e
Reorder Ceph stop and lower some waits
2023-09-01 15:42:25 -04:00
Joshua Boniface
cf609eb609
Add tasks to verify node has finished (un)flushing
2023-09-01 15:42:25 -04:00
Joshua Boniface
c29cdd5305
Increase all wait timeouts to 30s
...
Ensure that even on slow(er) clusters, these waits have more time to
complete before proceeding so the task won't fail.
2023-09-01 15:42:24 -04:00
Joshua Boniface
750cb4b55c
Disable pvc-flush service while rebooting
...
Prevents the flush daemon from starting on node boot, before the
playbook is actually ready to unflush the node.
2023-09-01 15:42:24 -04:00
Joshua Boniface
cdc7e3377b
Tweak oneshot script
...
Cleanly stop daemons; check if OSDs are back before continuing; wait
less
2023-09-01 15:42:24 -04:00
Joshua Boniface
9962ceaf0a
Add cluster safe update playbook
...
This playbook will perform a oneshot upgrade of the systems in the
cluster, including performing a clean and safe reboot of the node(s) if
required (either due to services needing a restart, or the kernel
changing). It runs in serial=1 and only reboots if needed.
2023-09-01 15:42:24 -04:00