Handle RBD locks on VM/node failure #41

Closed
opened 2019-07-09 01:03:12 -04:00 by JoshuaBoniface · 3 comments
JoshuaBoniface commented 2019-07-09 01:03:12 -04:00 (Migrated from git.bonifacelabs.ca)

If a node fails (is fenced, or otherwise stops VMs uncleanly), the RBD locks held by its VMs are never released, so those VMs fail to start on another node because their block devices are still locked.

Handle this situation better. Specifically:

  1. On a successful (or, if flagged, unsuccessful) node fence, find the VMs on the host (already being done for migration).
  2. Make sure that RBD locks for each volume tied to each VM are freed.
  3. Migrate (not move; not sure what I'm doing here right now) the VM to another node, ready to be restored.
  4. Set the failed node as "flushed".
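
For step 2, a minimal sketch of what freeing the locks could look like, using the `rbd lock list` and `rbd lock remove` CLI commands. This is an illustration, not the code that ended up in PVC: the function names, the `pool`/`volume` arguments, and the injectable `run` parameter are all hypothetical. The JSON shape of `rbd lock list --format json` is also an assumption (either a list of objects with `id` and `locker` keys, or a dict keyed by lock ID), so check it against the installed Ceph version.

```python
import json
import subprocess


def parse_rbd_locks(lock_list_json):
    """Normalise `rbd lock list --format json` output to (lock_id, locker) pairs.

    Assumption: the output is either a list of objects carrying "id" and
    "locker", or a dict keyed by lock ID whose values carry "locker".
    Empty output is treated as "no locks".
    """
    data = json.loads(lock_list_json) if lock_list_json.strip() else []
    if isinstance(data, dict):
        return [(lock_id, info["locker"]) for lock_id, info in data.items()]
    return [(entry["id"], entry["locker"]) for entry in data]


def build_unlock_commands(pool, volume, lock_list_json):
    """Build one `rbd lock remove` command per lock found on pool/volume."""
    image_spec = f"{pool}/{volume}"
    return [
        ["rbd", "lock", "remove", image_spec, lock_id, locker]
        for lock_id, locker in parse_rbd_locks(lock_list_json)
    ]


def free_volume_locks(pool, volume, run=subprocess.run):
    """List and remove every lock on one volume (hypothetical helper).

    Only call this once the owning node is confirmed fenced; removing a lock
    that a live QEMU process still relies on risks two writers on one image.
    """
    listing = run(
        ["rbd", "lock", "list", "--format", "json", f"{pool}/{volume}"],
        capture_output=True, text=True, check=True,
    )
    for cmd in build_unlock_commands(pool, volume, listing.stdout):
        run(cmd, check=True)
```

Splitting the parsing and command building from the actual `subprocess` calls keeps the logic checkable without a Ceph cluster. The fence handler would then call `free_volume_locks` for each volume of each VM found in step 1, before step 3.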

Also confirm what happens if a VM fails mid-run (`kill -9` of the QEMU process for instance) and whether this situation must be handled as well.

JoshuaBoniface commented 2019-07-09 01:04:04 -04:00 (Migrated from git.bonifacelabs.ca)

changed the description

JoshuaBoniface commented 2019-07-16 23:04:52 -04:00 (Migrated from git.bonifacelabs.ca)

This has all been implemented. Everything seems to work properly.

JoshuaBoniface commented 2019-07-16 23:04:53 -04:00 (Migrated from git.bonifacelabs.ca)

closed

Reference: parallelvirtualcluster/pvc#41