337 Commits

Author SHA1 Message Date
cb50eee2a9 Add OSD removal force option
Ensures a removal can continue even in situations where some step(s)
might fail, for instance removing an obsolete OSD from a replaced node.
2022-04-29 11:16:33 -04:00
313a5d1c7d Bump version to 0.9.47 2021-12-28 22:03:08 -05:00
c3d255be65 Bump version to 0.9.46 2021-12-28 15:02:14 -05:00
45fc8a47a3 Allow single-node clusters to restart and timeout
Prevents a daemon from waiting forever to terminate if it is primary,
and avoids this entirely if there is only a single node in the cluster.
2021-12-28 03:06:03 -05:00
07f2006f68 Fix bug when removing OSDs
Ensure the OSD is down as well as out; otherwise the purge might fail.
2021-12-28 03:05:34 -05:00
f4c7fdffb8 Handle detect strings as arguments for blockdevs
Allows specifying blockdevs in the OSD and OSD-DB addition commands as
detect strings rather than actual block device paths. This provides
greater flexibility for automation with pvcbootstrapd (which originates
the concept of detect strings) and in general usage as well.
2021-12-28 02:53:02 -05:00
02a2f6a27a Bump version to 0.9.45 2021-11-25 09:34:20 -05:00
3aa20fbaa3 Bump version to 0.9.44 2021-11-11 16:20:38 -05:00
6febcfdd97 Bump version to 0.9.43 2021-11-08 02:29:17 -05:00
16544227eb Reformat recent changes with Black 2021-11-06 03:27:07 -04:00
73e3746885 Fix linting error F541 f-string placeholders 2021-11-06 03:26:03 -04:00
66230ce971 Fix linting errors F522/F523 unused args 2021-11-06 03:24:50 -04:00
2083fd824a Reformat code with Black code formatter
Unify the code style along PEP 8 and Black principles using the tool.
2021-11-06 03:02:43 -04:00
3b02034b70 Add some delay and additional tries to fencing 2021-10-27 16:24:17 -04:00
1d7acf62bf Fix bad location of config sets 2021-10-12 17:23:04 -04:00
c790c331a7 Also validate on failures 2021-10-12 17:11:03 -04:00
23165482df Bump version to 0.9.42 2021-10-12 15:25:42 -04:00
057071a7b7 Go back to passing if exception
Validation already happened and the set happens again later.
2021-10-12 14:21:52 -04:00
554fa9f412 Use current live value for bridge_mtu
This will ensure that upgrading without the bridge_mtu config key set
will keep things as they are.
2021-10-12 12:24:03 -04:00
5a5f924268 Use power off in fence instead of reset
Use a power off (and then make the power on a requirement) during a node
fence. Removes some potential ambiguity in the power state, since we
will know for certain if it is off.
2021-10-12 11:04:27 -04:00
cc309fc021 Validate network MTU after initial read 2021-10-12 10:53:17 -04:00
d1f2ce0b0a Bump version to 0.9.41 2021-10-09 19:39:21 -04:00
12a3a3a6a6 Adjust log type of object setup message 2021-10-09 19:23:12 -04:00
c44732be83 Avoid duplicate runs of MTU set
It wasn't the validator duplicating, but the update duplicating, so
avoid that happening properly this time.
2021-10-09 19:21:47 -04:00
a8b68e0968 Revert "Avoid duplicate runs of MTU validator"
This reverts commit 56021c443a0a992e44350aa960976c6e8bddcb79.
2021-10-09 19:11:42 -04:00
e59152afee Set all log messages to information state
None of these were "success" messages and thus shouldn't have been in
the ok state.
2021-10-09 19:09:38 -04:00
56021c443a Avoid duplicate runs of MTU validator 2021-10-09 19:07:41 -04:00
ebdea165f1 Use correct isinstance instead of type 2021-10-09 19:03:31 -04:00
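
For context, a minimal sketch of the difference between an exact type() comparison and isinstance(). The MTUValue class and both helper functions are hypothetical and are not taken from the project source.

```python
class MTUValue(int):
    """Hypothetical int subclass, used only to show the difference."""


def is_int_by_type(value):
    # Exact type comparison: subclasses of int do not match.
    return type(value) == int


def is_int_by_isinstance(value):
    # isinstance() also accepts subclasses of int.
    return isinstance(value, int)


mtu = MTUValue(1500)
type_result = is_int_by_type(mtu)
isinstance_result = is_int_by_isinstance(mtu)
```

Here type_result is False and isinstance_result is True, because MTUValue is a subclass of int rather than int itself.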
fb0651fb05 Move MTU validation to function
Prevents code duplication and ensures validation runs when an MTU is
updated, not just on network creation.
2021-10-09 19:01:45 -04:00
35e7e11403 Add logger message when setting MTU 2021-10-09 18:56:18 -04:00
b7555468eb Ensure vx_mtu is always an int() 2021-10-09 18:52:50 -04:00
4698edc98e Add MTU value checking and log messages
Ensures that if a specified MTU is more than the maximum it is set to
the maximum instead, and adds warning messages for both situations.
2021-10-09 18:48:56 -04:00
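
The capping behaviour described in this commit could look roughly like the sketch below. The function name validate_mtu, its parameters, and the log wording are assumptions for illustration, not the project's actual code. Only the over-maximum case stated in the message is shown.

```python
import logging

logger = logging.getLogger("mtu_example")


def validate_mtu(requested_mtu, max_mtu):
    """Return requested_mtu as an int, capped at max_mtu (hypothetical helper)."""
    requested_mtu = int(requested_mtu)
    if requested_mtu > max_mtu:
        # Warn and fall back to the maximum instead of failing.
        logger.warning(
            "Requested MTU %d exceeds maximum %d; using %d instead",
            requested_mtu,
            max_mtu,
            max_mtu,
        )
        return max_mtu
    return requested_mtu
```

For example, validate_mtu(9000, 1500) returns 1500 and logs a warning, while validate_mtu("1450", 1500) returns 1450 unchanged.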
b0b0b75605 Have VXNetworkInstance set MTU if unset
Makes this explicit in Zookeeper if a network's MTU is unset,
post-migration (schema version 6).

Addresses #144
2021-10-09 17:52:57 -04:00
925141ed65 Fix migration bugs and invalid vx_mtu
Addresses #144
2021-10-09 17:35:10 -04:00
f7a826bf52 Add handlers for client network MTUs
Refactors some of the code in VXNetworkInterface to handle MTUs in a
more streamlined fashion. Also fixes a bug whereby bridge client
networks were being explicitly given the cluster dev MTU which might not
be correct. Now adds support for this option explicitly in the configs,
and defaults to 1500 for safety (the standard Ethernet MTU).

Addresses #144
2021-10-09 17:02:27 -04:00
feab5d3479 Correct flawed conditional in verify_ipmi 2021-10-07 15:11:19 -04:00
ee348593c9 Bump version to 0.9.40 2021-10-07 14:42:04 -04:00
e403146bcf Correct bad stop_keepalive_timer call 2021-10-07 14:41:12 -04:00
e79d200244 Bump version to 0.9.39 2021-10-07 11:52:38 -04:00
3449069e3d Bump version to 0.9.38 2021-10-03 22:32:41 -04:00
19ac1e17c3 Bump version to 0.9.37 2021-09-30 02:08:14 -04:00
3b41759262 Add timeouts to queue gets and adjust
Ensure that all keepalive timeouts are set (prevent the queue.get()
actions from blocking forever) and set the thread timeouts to line up as
well. Everything here is thus limited to keepalive_interval seconds
(default 5s) to keep it uniform.
2021-09-27 16:10:27 -04:00
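
A minimal sketch of a bounded queue read, using the standard library queue module. The name get_result and the constant KEEPALIVE_INTERVAL are assumptions that mirror the keepalive_interval setting mentioned above, not the project's actual identifiers.

```python
import queue

KEEPALIVE_INTERVAL = 5  # seconds; assumed to match the default noted above


def get_result(result_queue, timeout=KEEPALIVE_INTERVAL):
    """Return the next item, or None if nothing arrives within timeout."""
    try:
        # Without a timeout, get() would block until an item arrives.
        return result_queue.get(timeout=timeout)
    except queue.Empty:
        return None
```

A caller can then treat None as "no data this interval" and carry on with the next keepalive rather than hanging.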
e514eed414 Re-add success log output during migration 2021-09-27 11:50:55 -04:00
b81e70ec18 Fix missing character in log message 2021-09-27 00:49:43 -04:00
c2a473ed8b Simplify VM migration down to 3 steps
Remove two superfluous synchronization steps which are not needed here,
since the exclusive lock handles that situation anyway.

Still does not fix the weird flush->unflush lock timeout bug, but the
workaround is better now: cancelling the other wait frees this one up
and lets it continue.
2021-09-27 00:03:20 -04:00
5355f6ff48 Work around synchronization lock issues
Make the block on stage C only wait for 900 seconds (15 minutes) to
prevent indefinite blocking.

The issue arises if a VM is being received, and the current unflush is
cancelled for a flush. When this happens, this lock acquisition seems to
block for no obvious reason, and no other changes seem to affect it.
This is certainly some sort of locking bug within Kazoo but I can't
diagnose it as-is. Leave a TODO to look into this again in the future.
2021-09-26 23:26:21 -04:00
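
The idea of bounding a lock wait can be sketched with the standard library threading.Lock as a stand-in. This is an illustration only: the commit concerns a Kazoo (Zookeeper) lock, whose timeout handling may differ, and the names acquire_with_timeout and STAGE_LOCK_TIMEOUT are hypothetical.

```python
import threading

STAGE_LOCK_TIMEOUT = 900  # seconds (15 minutes), as described above


def acquire_with_timeout(lock, timeout=STAGE_LOCK_TIMEOUT):
    """Try to take the lock, giving up after timeout seconds.

    Returns True if the lock was acquired, False if the wait timed out.
    """
    return lock.acquire(timeout=timeout)


lock = threading.Lock()
first = acquire_with_timeout(lock, timeout=0.01)   # lock is free
second = acquire_with_timeout(lock, timeout=0.01)  # already held; times out
lock.release()
```

The second call returns False after the timeout instead of blocking indefinitely, which is the property the workaround relies on.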
bf7823deb5 Improve log messages during VM migration 2021-09-26 23:15:38 -04:00
8ba371723e Use event to non-block wait and fix inf wait 2021-09-26 22:55:39 -04:00
e10ac52116 Track status of VM state thread 2021-09-26 22:55:21 -04:00
341073521b Simplify locking process for VM migration
Rather than using a cumbersome and overly complex ping-pong of read and
write locks, instead move to a much simpler process using exclusive
locks.

Describing the process in ASCII or narrative is cumbersome, but the
process ping-pongs via a set of exclusive locks and wait timers, so that
the two sides are able to synchronize via blocking the exclusive lock.
The end result is a much more streamlined migration (takes about half
the time all things considered) which should be less error-prone.
2021-09-26 22:08:07 -04:00