Joshua Boniface
2a1e76f479
Split upgrade stage and add dpkg cleanup
Avoid problems if one or more nodes are upgrading libvirt/QEMU and live
migrations fail.
2023-02-23 15:42:19 -05:00
Joshua Boniface
60092d363a
Add node daemon confirmation before continue
2023-02-23 13:54:22 -05:00
Joshua Boniface
edc76114c6
Trigger restart even with rc=3
2023-02-22 18:54:07 -05:00
Joshua Boniface
3ee4e7cd3f
Ignore needrestart unknown case
2023-01-16 17:19:34 -05:00
Joshua Boniface
87e7449eca
Ensure freshness check is proper
2022-09-02 10:05:19 -04:00
Joshua Boniface
503a2e6c0b
Remove pvc-flush references
This service usually causes more problems than it solves, so it is being
removed in the next PVC version.
2022-07-25 23:19:38 -04:00
Joshua Boniface
1a7969b707
Update freshness checks
2022-05-31 22:27:30 -04:00
Joshua Boniface
70c7c76605
Move pvc maintenance to separate plays
This ensures that maintenance mode is turned on before all tasks and off
after all tasks, rather than intermittently.
2021-11-11 15:54:22 -05:00
Joshua Boniface
820e2a64d0
Ignore errors during flush commands
These might inexplicably fail, but that is fine.
2021-10-13 10:34:36 -04:00
Joshua Boniface
74066e6ceb
Avoid errors if noout fails
2021-10-07 16:31:52 -04:00
Joshua Boniface
311f388f56
Increase flush/unflush wait timeout
Bump this from 10 minutes (60 * 10 seconds) to 30 minutes (180 * 10
seconds) to ensure there is sufficient time for (relatively) large VMs
to migrate with (relatively) slow networking.
2021-07-22 16:16:27 -04:00
Joshua Boniface
942743daef
Use wait on secondary and delay for 15 seconds
2021-07-22 09:35:00 -04:00
Joshua Boniface
bb094193b4
Adjust ordering of flush task
2021-07-06 09:28:59 -04:00
Joshua Boniface
cae8cfc4cb
Add norestart policy for apt updates
2021-05-27 01:38:43 -04:00
Joshua Boniface
491ea77306
Add README and daemon upgrade playbook, cleanups
2021-05-20 11:02:47 -04:00
Joshua Boniface
510db0df58
Add cleanup to update oneshot playbook
2021-02-02 15:41:38 -05:00
Joshua Boniface
97869ca5c3
Reorder Ceph stop and lower some waits
2021-01-07 11:11:16 -05:00
Joshua Boniface
d35250b870
Add tasks to verify node has finished (un)flushing
2021-01-07 10:49:23 -05:00
Joshua Boniface
cd164d1984
Increase all wait timeouts to 30s
Ensure that even on slow(er) clusters, these waits have enough time to
complete before proceeding, so the tasks won't fail.
2021-01-05 16:17:19 -05:00
Joshua Boniface
f277acc974
Disable pvc-flush service while rebooting
Prevents the flush daemon from starting on node boot, before the
playbook is actually ready to unflush the node.
2020-12-15 14:32:50 -05:00
Joshua Boniface
8b474760ed
Tweak oneshot script
Cleanly stop daemons; check if OSDs are back before continuing; wait
less
2020-11-26 10:51:54 -05:00
Joshua Boniface
b4ba4f9eda
Add cluster safe update playbook
This playbook will perform a oneshot upgrade of the systems in the
cluster, including performing a clean and safe reboot of the node(s) if
required (either due to services needing a restart, or the kernel
changing). It runs in serial=1 and only reboots if needed.
2020-10-27 15:41:20 -04:00