65 Commits

Author SHA1 Message Date
2928d695c9 Ensure migration method is updated on state changes 2021-06-22 03:20:15 -04:00
60e1da09dd Don't try any shenanigans when updating NICs
Trying to do this on the VMInstance side had problems because we can't
differentiate the 3 types of migration there. So, just update this in
the API side and hope everything goes well.

This introduces an edge bug: if a VM is using a macvtap SR-IOV device,
and then tries to migrate, and the migrate is aborted, the NIC lists
will be inconsistent.

When I revamp the VMInstance in the future, I should be able to correct
this, but for now we'll have to live with that edge case.
2021-06-22 00:00:50 -04:00
7d42fba373 Ensure being in migrate doesn't abort shutdown 2021-06-21 23:28:53 -04:00
24ce361a04 Ensure SR-IOV NIC states are updated on migration 2021-06-21 23:18:34 -04:00
e4a65230a1 Just do the shutdown command itself 2021-06-15 02:32:14 -04:00
284c581845 Ensure shutdown migrations actually time out
Without this a VM that fails to respond to a shutdown will just spin
forever, blocking state changes.
2021-06-15 00:23:15 -04:00
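The timeout described in 284c581845 can be pictured as a bounded polling loop. This is only a minimal sketch, not the actual VMInstance code: the function name, the `is_running` callable, and the 180-second default are all assumptions made for illustration.

```python
import time


def wait_for_shutdown(is_running, timeout=180.0, poll_interval=1.0):
    """Poll until the VM stops or the timeout elapses.

    is_running is a hypothetical callable returning True while the VM
    is still up. Returns True if the VM stopped in time, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while is_running():
        if time.monotonic() >= deadline:
            # Give up instead of spinning forever and blocking state changes
            return False
        time.sleep(poll_interval)
    return True
```

A caller that gets False back can then force the VM off or abort the migration, rather than waiting indefinitely.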
26b1f531e9 Fix bad variable interpolation 2021-06-13 14:37:23 -04:00
85aba7cc18 Convert VMInstance to new ZK schema handler 2021-06-09 23:15:08 -04:00
cd75413667 Increase initial lock timer
With the new library the reader seems to be a little too quick, so hold
the write lock for 1 second instead of 1/2 second to ensure it is
caught.
2021-06-01 17:00:11 -04:00
9764090d6d Merge node common with daemon common 2021-06-01 12:22:11 -04:00
790098f181 Convert VMInstance to new zkhandler 2021-06-01 11:46:27 -04:00
0bf276fd51 Update copyright year in headers 2021-03-25 17:01:55 -04:00
f4ec161aa2 Update file copyright header.
Remove the option to select a later version of the GPL.
2021-03-25 16:58:02 -04:00
1b6613c280 Add live VNC information to domain output
Sets in the node daemon, returns via the API, and shows in the CLI,
information about the live VNC listen address and port for VNC-enabled
VMs.

Closes #115
2020-12-20 16:00:55 -05:00
3705daff43 Better handle failing RBD lock frees
If the VM is not in a stop state, failing to free the lock is now
considered a fatal error and will put the domain into fail state,
aborting the start. This is better than being unsafe or trying to start
a VM which will fail to boot due to read-only volumes.
2020-12-14 16:04:38 -05:00
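The rule in 3705daff43 reduces to a small decision on the domain state. The sketch below assumes the state names "stop" and "fail" as plain strings and uses a hypothetical function name; it is not taken from the project source.

```python
def state_after_lock_flush(current_state, flush_succeeded):
    """Decide the domain state after trying to free its RBD locks.

    A failed flush is fatal unless the VM is already in the stop state.
    """
    if flush_succeeded or current_state == "stop":
        return current_state
    # Starting with stale locks would be unsafe or leave volumes
    # read-only, so abort the start by failing the domain.
    return "fail"
```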
7c99a7bda7 Safely reset RBD locks on failed VMs
Should correct issues on cold start as well as if a VM crashes
uncleanly, which would prevent the VM from starting due to stale RBD
locks.

This implementation has four parts:
  1. Update how IP addresses are handled, specifically by replacing all
  previous instances of "vni_ipaddr" with "vni_floatingipaddr", and then
  adding the "vni_ipaddr" with the real data for this node's IPs. Also
  include the storage IPs in this where they weren't before, so each
  this_node actually has the local IPs plus floating IPs. This enables
  the next two steps.
  2. Modify flush_locks to take this_node as an argument, and update the
  run_command function to only operate against this node, rather than on
  the primary coordinator.
  3. Have the flush_locks check each lock against the current node, to
  verify that the lock is actually held by the current node. This is the
  only way to do this safely. During fencing, we override this by not
  passing a this_node which bypasses this check.
  4. Have the VM start do the check for VM failure/startup and execute a
  flush_locks before actually starting the VM.
2020-12-14 15:53:18 -05:00
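Part 3 of 7c99a7bda7, checking each lock against the current node and skipping that check during fencing, might look roughly like the following. This is a sketch under stated assumptions: the lock dictionaries, their "address" field in an IPv4 "ip:port/nonce" form, and the function name are hypothetical and do not reflect the real flush_locks signature.

```python
def locks_safe_to_flush(locks, this_node_ips=None):
    """Return the locks that may be removed.

    locks: list of dicts with hypothetical keys 'id' and 'address'.
    this_node_ips: set of local plus floating IPs for this node, or
    None when fencing, which bypasses the ownership check.
    """
    if this_node_ips is None:
        # Fencing: the holder is presumed dead, so every lock is flushed
        return list(locks)

    safe = []
    for lock in locks:
        # Assumed address form "10.0.0.1:0/12345"; keep only the IP part
        holder_ip = lock["address"].split(":")[0]
        if holder_ip in this_node_ips:
            safe.append(lock)
    return safe
```

Locks held by another node are left alone, which is what makes the flush safe to run before a VM start.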
70dfcd434f Ensure inmigrate is cleared on failure 2020-11-17 12:57:37 -05:00
260b39ebf2 Lint: E302 expected 2 blank lines, found X 2020-11-07 14:45:24 -05:00
ab0b932fe3 Lint: E125 continuation line with same indent as next logical line 2020-11-07 13:49:54 -05:00
e553c5d42a Lint: E122 continuation line missing indentation or outdented 2020-11-07 13:12:26 -05:00
7932be3948 Lint: E261 at least two spaces before inline comment 2020-11-07 13:11:03 -05:00
3f242cd437 Lint: E202 whitespace before '}' 2020-11-07 12:57:42 -05:00
e333f2b935 Lint: E201 whitespace after '{' 2020-11-07 12:38:31 -05:00
4b47a2424c Lint: E303 too many blank lines (2) 2020-11-06 21:16:52 -05:00
5da314902f Lint: F841 local variable '<variable>' is assigned to but never used 2020-11-06 21:13:13 -05:00
aecb845d6a Lint: E713 test for membership should be 'not in' 2020-11-06 20:37:52 -05:00
57c51d3234 Lint: E711 comparison to None should be 'if cond is not None:' 2020-11-06 19:37:13 -05:00
ce01b41d81 Lint: E711 comparison to None should be 'if cond is None:' 2020-11-06 19:36:36 -05:00
4d6f36aca0 Lint: E712 comparison to False should be 'if cond is False:' or 'if not cond:' 2020-11-06 19:35:51 -05:00
d9e7b7ec15 Lint: F401 <library> imported but unused 2020-11-06 19:22:49 -05:00
63f4f9aed7 Lint: E722 do not use bare 'except' 2020-11-06 18:55:10 -05:00
ec0b8acf90 Support per-VM migration type selectors
Allow a VM to specify its migration type as a default choice. The valid
options are "default" (i.e. behave as now), "live" which forces a live
migration only, and "shutdown" which forces a shutdown migration only.
The new option is treated as a VM meta option and is set to default if
not found.
2020-10-29 12:01:29 -04:00
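The selector from ec0b8acf90 can be illustrated as a lookup with a fallback. In this sketch the meta key name "migration_method" and both function names are assumptions; only the three option values and the fall-back-to-default behaviour come from the commit message.

```python
VALID_MIGRATION_METHODS = ("default", "live", "shutdown")


def resolve_migration_method(vm_meta):
    """Read the per-VM selector, using "default" when it is not set."""
    method = vm_meta.get("migration_method", "default")
    if method not in VALID_MIGRATION_METHODS:
        raise ValueError("invalid migration method: {}".format(method))
    return method


def allowed_migration_types(method):
    """Map a selector to the migration types it permits."""
    if method == "live":
        return {"live"}
    if method == "shutdown":
        return {"shutdown"}
    # "default" forces neither type, so both remain available
    return {"live", "shutdown"}
```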
890023cbfc Make sender wait dynamic based on receiver 2020-10-21 14:43:54 -04:00
28abb018e3 Improve some timeouts and conditionals 2020-10-21 12:00:10 -04:00
017953c2e6 Move lock release to phase D 2020-10-21 11:07:01 -04:00
82b4d3ed1b Add missing prefix statements to loggers 2020-10-21 10:52:53 -04:00
bae366a316 Add waits and only receive check on send 2020-10-21 10:43:42 -04:00
351076c15e Check if node changed during final check
Avoids situations where two migrates, to different nodes, happen in
rapid succession. Aborts the migration if the current target node no
longer matches what was set at the start of the execution.
2020-10-21 02:52:36 -04:00
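The final check in 351076c15e amounts to re-reading the target node and comparing it with the value captured at the start. A minimal sketch, where `read_current_target` is a hypothetical callable standing in for however the node is read from cluster state:

```python
def should_abort_migration(target_at_start, read_current_target):
    """Return True if the target node changed while the migration ran."""
    current_target = read_current_target()
    # A second migrate request to a different node has superseded this one
    return current_target != target_at_start
```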
42514b9a50 Improve messages further 2020-10-21 02:41:42 -04:00
611e47f338 Add messages to migration aborts
Results in some information duplication, but ensures logging of the
reason a migration was aborted separate from the error(s) this may
generate.
2020-10-21 02:38:42 -04:00
1523959074 Move where setting last_ vars happens 2020-10-21 02:24:00 -04:00
ef762359f4 Adjust timing to avoid migrating to self quickly
Add another separate state lock, release it earlier, and ensure timings
are good to avoid double-migrating one VM.
2020-10-21 02:17:55 -04:00
398d33778f Avoid stopping duplicates, just lock our own key 2020-10-20 16:10:39 -04:00
a6d492ed9f Remove spurious writes and adjust sleep 2020-10-20 16:04:26 -04:00
11fa3b0df3 Remove additional wait and add last_node entries
These allow for aborting a migration to retain the previous settings and
override what the client set.
2020-10-20 15:58:55 -04:00
442aa4e420 Tweak timers further 2020-10-20 15:43:59 -04:00
3910843660 Add missing break 2020-10-20 15:39:29 -04:00
70f3fdbfb9 Tweak the delays slightly on receive 2020-10-20 15:38:07 -04:00
7cb0241a12 Attempt live migrates 3 times before proceeding 2020-10-20 15:33:41 -04:00
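A bounded retry such as the one named in 7cb0241a12 could be sketched as below. The `migrate_once` callable, the delay value, and the function name are assumptions for illustration; only the count of 3 attempts comes from the commit title.

```python
import time


def attempt_live_migration(migrate_once, attempts=3, retry_delay=1.0):
    """Try a live migration up to `attempts` times.

    migrate_once is a hypothetical callable returning True on success.
    Returns True on the first success, False if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        if migrate_once():
            return True
        if attempt < attempts:
            # Brief pause before retrying; no sleep after the last failure
            time.sleep(retry_delay)
    return False
```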
9fb33ed7a7 Increase peer lock acquiring timers 2020-10-20 15:26:59 -04:00