Joshua Boniface
2a1e76f479
Split upgrade stage and add dpkg cleanup
Avoid problems if one or more nodes are upgrading libvirt/QEMU and live
migrations fail.
2023-02-23 15:42:19 -05:00
Joshua Boniface
60092d363a
Add node daemon confirmation before continuing
2023-02-23 13:54:22 -05:00
edc76114c6
Trigger restart even with rc=3
2023-02-22 18:54:07 -05:00
Joshua Boniface
3ee4e7cd3f
Ignore needrestart unknown case
2023-01-16 17:19:34 -05:00
87e7449eca
Ensure freshness check is proper
2022-09-02 10:05:19 -04:00
503a2e6c0b
Remove pvc-flush references
This service usually causes more problems than it solves, so it is being
removed in the next PVC version.
2022-07-25 23:19:38 -04:00
1a7969b707
Update freshness checks
2022-05-31 22:27:30 -04:00
Joshua Boniface
70c7c76605
Move pvc maintenance to separate plays
This ensures that maintenance mode is turned on before all tasks and
turned off after all tasks, rather than intermittently.
2021-11-11 15:54:22 -05:00
Joshua Boniface
820e2a64d0
Ignore errors during flush commands
These might inexplicably fail, but that is fine.
2021-10-13 10:34:36 -04:00
74066e6ceb
Avoid errors if noout fails
2021-10-07 16:31:52 -04:00
311f388f56
Increase flush/unflush wait timeout
Bump this from 10 minutes (60 * 10 seconds) to 30 minutes (180 * 10
seconds) to ensure there is sufficient time for (relatively) large VMs
to migrate with (relatively) slow networking.
2021-07-22 16:16:27 -04:00
942743daef
Use wait on secondary and delay for 15 seconds
2021-07-22 09:35:00 -04:00
bb094193b4
Adjust ordering of flush task
2021-07-06 09:28:59 -04:00
cae8cfc4cb
Add norestart policy for apt updates
2021-05-27 01:38:43 -04:00
491ea77306
Add README and daemon upgrade playbook, cleanups
2021-05-20 11:02:47 -04:00
510db0df58
Add cleanup to update oneshot playbook
2021-02-02 15:41:38 -05:00
97869ca5c3
Reorder Ceph stop and lower some waits
2021-01-07 11:11:16 -05:00
d35250b870
Add tasks to verify node has finished (un)flushing
2021-01-07 10:49:23 -05:00
cd164d1984
Increase all wait timeouts to 30s
Ensure that, even on slow(er) clusters, the awaited operations have more
time to complete before proceeding so the task won't fail.
2021-01-05 16:17:19 -05:00
f277acc974
Disable pvc-flush service while rebooting
Prevents the flush daemon from starting on node boot, before the
playbook is actually ready to unflush the node.
2020-12-15 14:32:50 -05:00
8b474760ed
Tweak oneshot script
Cleanly stop daemons; check if OSDs are back before continuing; wait
less
2020-11-26 10:51:54 -05:00
b4ba4f9eda
Add cluster safe update playbook
This playbook will perform a oneshot upgrade of the systems in the
cluster, including performing a clean and safe reboot of the node(s) if
required (either due to services needing a restart, or the kernel
changing). It runs in serial=1 and only reboots if needed.
2020-10-27 15:41:20 -04:00