629 Commits

Author SHA1 Message Date
Joshua Boniface c3d255be65 Bump version to 0.9.46 2021-12-28 15:02:14 -05:00
Joshua Boniface 45fc8a47a3 Allow single-node clusters to restart and timeout
Prevents a daemon from waiting forever to terminate if it is primary,
and avoids this entirely if there is only a single node in the cluster.
2021-12-28 03:06:03 -05:00
Joshua Boniface 07f2006f68 Fix bug when removing OSDs
Ensure the OSD is down as well as out or purge might fail.
2021-12-28 03:05:34 -05:00
Joshua Boniface f4c7fdffb8 Handle detect strings as arguments for blockdevs
Allows specifying blockdevs in the OSD and OSD-DB addition commands as
detect strings rather than actual block device paths. This provides
greater flexibility for automation with pvcbootstrapd (which originates
the concept of detect strings) and in general usage as well.
2021-12-28 02:53:02 -05:00
Joshua Boniface 02a2f6a27a Bump version to 0.9.45 2021-11-25 09:34:20 -05:00
Joshua Boniface 658e80350f Fix ordering of pvcnoded unit
We want to be after network.target and want network-online.target
2021-11-18 16:56:49 -05:00
Joshua Boniface 3aa20fbaa3 Bump version to 0.9.44 2021-11-11 16:20:38 -05:00
Joshua Boniface 6d101df1ff Add Munin plugin for Ceph utilization 2021-11-08 15:21:09 -05:00
Joshua Boniface 6febcfdd97 Bump version to 0.9.43 2021-11-08 02:29:17 -05:00
Joshua Boniface 16544227eb Reformat recent changes with Black 2021-11-06 03:27:07 -04:00
Joshua Boniface 73e3746885 Fix linting error F541 f-string placeholders 2021-11-06 03:26:03 -04:00
Joshua Boniface 66230ce971 Fix linting errors F522/F523 unused args 2021-11-06 03:24:50 -04:00
Joshua Boniface 2083fd824a Reformat code with Black code formatter
Unify the code style along PEP and Black principles using the tool.
2021-11-06 03:02:43 -04:00
Joshua Boniface 3b02034b70 Add some delay and additional tries to fencing 2021-10-27 16:24:17 -04:00
Joshua Boniface 1d7acf62bf Fix bad location of config sets 2021-10-12 17:23:04 -04:00
Joshua Boniface c790c331a7 Also validate on failures 2021-10-12 17:11:03 -04:00
Joshua Boniface 23165482df Bump version to 0.9.42 2021-10-12 15:25:42 -04:00
Joshua Boniface 057071a7b7 Go back to passing if exception
Validation already happened and the set happens again later.
2021-10-12 14:21:52 -04:00
Joshua Boniface 554fa9f412 Use current live value for bridge_mtu
This will ensure that upgrading without the bridge_mtu config key set
will keep things as they are.
2021-10-12 12:24:03 -04:00
Joshua Boniface 5a5f924268 Use power off in fence instead of reset
Use a power off (and then make the power on a requirement) during a node
fence. Removes some potential ambiguity in the power state, since we
will know for certain if it is off.
2021-10-12 11:04:27 -04:00
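A minimal sketch of the power-off-then-power-on fence described in the commit above. The three callables, the `tries` and `delay` parameters, and the function name are hypothetical stand-ins rather than the project's actual code, and treating a failed power on as a failed fence is one reading of "make the power on a requirement".

```python
import time


def fence_node(power_off, is_powered_off, power_on, tries=3, delay=1.0):
    """Power a node off, confirm it is off, then require a successful power on.

    All three callables are hypothetical stand-ins for whatever out-of-band
    management calls are actually used.
    """
    for _ in range(tries):
        power_off()
        if is_powered_off():
            # The node is known to be off, so its power state is unambiguous.
            # Under this sketch's assumption, the fence only succeeds if the
            # node can then be powered on again.
            return bool(power_on())
        # Not confirmed off yet: wait briefly before trying again.
        time.sleep(delay)
    # The node was never confirmed as off, so treat the fence as failed.
    return False
```

Passing the power actions in as callables keeps the sketch independent of any particular management interface.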
Joshua Boniface cc309fc021 Validate network MTU after initial read 2021-10-12 10:53:17 -04:00
Joshua Boniface d1f2ce0b0a Bump version to 0.9.41 2021-10-09 19:39:21 -04:00
Joshua Boniface 12a3a3a6a6 Adjust log type of object setup message 2021-10-09 19:23:12 -04:00
Joshua Boniface c44732be83 Avoid duplicate runs of MTU set
It wasn't the validator duplicating, but the update duplicating, so
avoid that happening properly this time.
2021-10-09 19:21:47 -04:00
Joshua Boniface a8b68e0968 Revert "Avoid duplicate runs of MTU validator"
This reverts commit 56021c443a.
2021-10-09 19:11:42 -04:00
Joshua Boniface e59152afee Set all log messages to information state
None of these were "success" messages and thus shouldn't have been ok
state.
2021-10-09 19:09:38 -04:00
Joshua Boniface 56021c443a Avoid duplicate runs of MTU validator 2021-10-09 19:07:41 -04:00
Joshua Boniface ebdea165f1 Use correct isinstance instead of type 2021-10-09 19:03:31 -04:00
Joshua Boniface fb0651fb05 Move MTU validation to function
Prevents code duplication and ensures validation runs when an MTU is
updated, not just on network creation.
2021-10-09 19:01:45 -04:00
Joshua Boniface 35e7e11403 Add logger message when setting MTU 2021-10-09 18:56:18 -04:00
Joshua Boniface b7555468eb Ensure vx_mtu is always an int() 2021-10-09 18:52:50 -04:00
Joshua Boniface 4698edc98e Add MTU value checking and log messages
Ensures that if a specified MTU is more than the maximum it is set to
the maximum instead, and adds warning messages for both situations.
2021-10-09 18:48:56 -04:00
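A minimal sketch of the clamping behaviour described in the commit above, assuming the maximum is already known to the caller. The function name, logger name, and message text are illustrative assumptions, not the project's actual code, and only the over-maximum case is shown.

```python
import logging

logger = logging.getLogger("mtu_sketch")  # hypothetical logger name


def clamp_mtu(requested_mtu, maximum_mtu):
    """Return an MTU no larger than maximum_mtu, warning when it is lowered."""
    # Coerce to int first, since configured or stored values may arrive as strings.
    requested_mtu = int(requested_mtu)
    maximum_mtu = int(maximum_mtu)
    if requested_mtu > maximum_mtu:
        logger.warning(
            "Requested MTU %d exceeds maximum %d; using maximum instead",
            requested_mtu,
            maximum_mtu,
        )
        return maximum_mtu
    return requested_mtu
```

For example, `clamp_mtu(9000, 1500)` returns `1500` and logs a warning, while `clamp_mtu("1400", 1500)` returns `1400` unchanged.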
Joshua Boniface b0b0b75605 Have VXNetworkInstance set MTU if unset
Makes this explicit in Zookeeper if a network is unset, post-migration
(schema version 6).

Addresses #144
2021-10-09 17:52:57 -04:00
Joshua Boniface 925141ed65 Fix migration bugs and invalid vx_mtu
Addresses #144
2021-10-09 17:35:10 -04:00
Joshua Boniface f7a826bf52 Add handlers for client network MTUs
Refactors some of the code in VXNetworkInterface to handle MTUs in a
more streamlined fashion. Also fixes a bug whereby bridge client
networks were being explicitly given the cluster dev MTU which might not
be correct. Now adds support for this option explicitly in the configs,
and defaults to 1500 for safety (the standard Ethernet MTU).

Addresses #144
2021-10-09 17:02:27 -04:00
Joshua Boniface feab5d3479 Correct flawed conditional in verify_ipmi 2021-10-07 15:11:19 -04:00
Joshua Boniface ee348593c9 Bump version to 0.9.40 2021-10-07 14:42:04 -04:00
Joshua Boniface e403146bcf Correct bad stop_keepalive_timer call 2021-10-07 14:41:12 -04:00
Joshua Boniface e79d200244 Bump version to 0.9.39 2021-10-07 11:52:38 -04:00
Joshua Boniface 3449069e3d Bump version to 0.9.38 2021-10-03 22:32:41 -04:00
Joshua Boniface 19ac1e17c3 Bump version to 0.9.37 2021-09-30 02:08:14 -04:00
Joshua Boniface 3b41759262 Add timeouts to queue gets and adjust
Ensure that all keepalive timeouts are set (prevent the queue.get()
actions from blocking forever) and set the thread timeouts to line up as
well. Everything here is thus limited to keepalive_interval seconds
(default 5s) to keep it uniform.
2021-09-27 16:10:27 -04:00
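A minimal sketch of the bounded `queue.get()` pattern described in the commit above, using the standard library `queue` module. The helper name and the choice to return `None` on timeout are assumptions made for illustration; the 5 second default mirrors the stated `keepalive_interval` default.

```python
import queue


def get_with_timeout(result_queue, keepalive_interval=5):
    """Wait at most keepalive_interval seconds for an item; return None on timeout."""
    try:
        # Without a timeout, get() would block until an item arrives,
        # which can be forever if the producer has died.
        return result_queue.get(timeout=keepalive_interval)
    except queue.Empty:
        # Nothing arrived within the interval; let the caller decide what to do.
        return None
```

Using the same interval for the queue reads and for any thread joins keeps every wait in a keepalive cycle bounded by one value.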
Joshua Boniface e514eed414 Re-add success log output during migration 2021-09-27 11:50:55 -04:00
Joshua Boniface b81e70ec18 Fix missing character in log message 2021-09-27 00:49:43 -04:00
Joshua Boniface c2a473ed8b Simplify VM migration down to 3 steps
Remove two superfluous synchronization steps which are not needed here,
since the exclusive lock handles that situation anyways.

Still does not fix the weird flush->unflush lock timeout bug, but is
better worked-around now due to the cancelling of the other wait freeing
this up and continuing.
2021-09-27 00:03:20 -04:00
Joshua Boniface 5355f6ff48 Work around synchronization lock issues
Make the block on stage C only wait for 900 seconds (15 minutes) to
prevent indefinite blocking.

The issue comes if a VM is being received, and the current unflush is
cancelled for a flush. When this happens, this lock acquisition seems to
block for no obvious reason, and no other changes seem to affect it.
This is certainly some sort of locking bug within Kazoo but I can't
diagnose it as-is. Leave a TODO to look into this again in the future.
2021-09-26 23:26:21 -04:00
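A minimal sketch of the bounded lock wait described in the commit above. It uses the standard library `threading.Lock` as a stand-in for the Zookeeper lock the commit refers to, so the return-value handling here is an assumption and may differ from how the Kazoo lock reports a timeout. The helper name is hypothetical; the 900 second default comes from the commit message.

```python
import threading


def acquire_with_limit(lock, timeout_seconds=900):
    """Try to take the lock, giving up after timeout_seconds instead of blocking forever."""
    # threading.Lock.acquire returns True if the lock was taken and
    # False if the timeout expired first.
    return lock.acquire(timeout=timeout_seconds)
```

A caller would check the result and continue (or log and abort) when it is `False`, rather than waiting indefinitely on a lock that may never be released.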
Joshua Boniface bf7823deb5 Improve log messages during VM migration 2021-09-26 23:15:38 -04:00
Joshua Boniface 8ba371723e Use event to non-block wait and fix inf wait 2021-09-26 22:55:39 -04:00
Joshua Boniface e10ac52116 Track status of VM state thread 2021-09-26 22:55:21 -04:00
Joshua Boniface 341073521b Simplify locking process for VM migration
Rather than using a cumbersome and overly complex ping-pong of read and
write locks, instead move to a much simpler process using exclusive
locks.

Describing the process in ASCII or narrative is cumbersome, but the
process ping-pongs via a set of exclusive locks and wait timers, so that
the two sides are able to synchronize via blocking the exclusive lock.
The end result is a much more streamlined migration (takes about half
the time all things considered) which should be less error-prone.
2021-09-26 22:08:07 -04:00