Commit Graph

295 Commits

Author SHA1 Message Date
Joshua Boniface b413e042a6 Improve handling of primary contention
Previously, contention could occasionally cause a flap/dual primary
contention state due to the lack of checking within this function. This
could cause a state where a node transitions to primary than is almost
immediately shifted away, which could cause undefined behaviour in the
cluster.

The solution includes several elements:
    * Implement an exclusive lock operation in zkhandler
    * Switch the become_primary function to use this exclusive lock
    * Implement exclusive locking during the contention process
    * As a failsafe, check stat versions before setting the node as the
      primary node, in case another node already has
    * Delay the start of takeover/relinquish operations by slightly
      longer than the lock timeout
    * Make the current router_state conditions more explicit (positive
      conditionals rather than negative conditionals)

The new scenario ensures that during contention, only one secondary will
ever succeed at acquiring the lock. Ideally, the other would then grab
the lock and pass, but in testing this does not seem to be the case -
the lock always times out, so the failsafe check is technically not
needed but has been left as an added safety mechanism. With this setup,
the node that fails the contention will never block the switchover nor
will it try to force itself onto the cluster after another node has
successfully won contention.

Timeouts may need to be adjusted in the future, but the base timeout of
0.4 seconds (and transition delay of 0.5 seconds) seems to work reliably
during preliminary tests.
2020-04-12 03:40:17 -04:00
Joshua Boniface e672d799a6 Set flush after pvcapid.service
This may or may not help, but should in theory prevent the flush from
trying to run after a (locally-running) API daemon is terminated, which
could cause an API failure and a failure to flush.
2020-04-12 01:48:50 -04:00
Joshua Boniface a130f19a19 Depend pvcnoded on Zookeeper (harder) and libvirtd 2020-04-09 09:57:53 -04:00
Joshua Boniface a671d9d457 Use consistent tense in messages 2020-04-08 22:00:51 -04:00
Joshua Boniface fee1c7dd6c Reorder cleanup and gracefully wait for flushes 2020-04-08 22:00:08 -04:00
Joshua Boniface 5d58bee34f Add some time around noded startup/shutdown
Otherwise, systemd kills networking before the node daemon fully stops
and it goes into "dead" status, which is super annoying.
2020-04-01 23:59:14 -04:00
Joshua Boniface f668412941 Don't use Requires as the dep is too hard
Requires seems to flush on every service restart which is NOT what we
want. Use Wants instead.
2020-04-01 15:15:37 -04:00
Joshua Boniface a0ebc0d3a7 Add more robust requirements to pvc-flush service 2020-04-01 15:09:44 -04:00
Joshua Boniface 98a7005c1b Add significant TimeoutSec to pvc-flush service
This will stop systemd from killing the service in the middle of a flush
or unflush operation, which completely defeats the purpose. 30 minutes
was chosen as this is a very large but still somewhat manageable value,
which should cover even a very large very loaded cluster with room to
spare.
2020-04-01 01:24:09 -04:00
Joshua Boniface 0a367898a0 Don't trigger aggregator fail if fine 2020-03-12 13:22:12 -04:00
Joshua Boniface c02bc0b46a Correct issues with VM lock freeing
Code was bad and using a depricated feature.
2020-03-02 12:45:12 -05:00
Joshua Boniface 1e4350ca6f Properly handle takeover state in VXNetworks
Most of these actions/conditionals were looking for primary state, but
were failing during node takeover. Update the conditionals to look for
both router states instead.

Also add a wait to lock flushing until a takeover is completed.
2020-03-02 10:41:00 -05:00
Joshua Boniface 57768f2583 Remove an obsolete script 2020-02-19 21:40:23 -05:00
Joshua Boniface e4e4e336b4 Handle invalid cursor setup cleanly
This seems to happen only during termination, so catch it and continue
so the loop terminates.
2020-02-19 16:29:59 -05:00
Joshua Boniface d2a5fe59c0 Use transitional takeover states for migration
Use a pair of transitional states, "takeover" and "relinquish", when
transitioning between primary and secondary coordinator states. This
provides a clsuter-wide record that the nodes are still working during
their synchronous transition states, and should allow clients to
determine when the node(s) have fully switched over. Also add an
additional 2 seconds of wait at the end of the transition jobs to ensure
everything has had a chance to start before proceeding.

References #72
2020-02-19 14:06:54 -05:00
Joshua Boniface 9c7041f12c Update package version to 0.7 2020-02-15 23:25:47 -05:00
Joshua Boniface 7ace5b5056 Remove /ceph/cmd pipe for (most) Ceph commands
Addresses #80
2020-02-08 23:40:02 -05:00
Joshua Boniface 37310e5455 Correct name of systemd target 2020-02-08 20:39:07 -05:00
Joshua Boniface ce985234c3 Use consistent naming of components
Rename "pvcd" to "pvcnoded", and "pvc-api" to "pvcapid" so names for the
daemons are fully consistent. Update the names of the configuration
files as well to match this new formatting.

References #79
2020-02-08 19:34:07 -05:00
Joshua Boniface 4505b239eb Rename API and common Debian packages
Closes #79
2020-02-08 18:50:38 -05:00
Joshua Boniface 74228eb063 Bump version to 0.6 2020-02-08 18:27:39 -05:00
Joshua Boniface 90e42683c6 Reduce sleep time during VM migrations 2020-02-04 17:52:37 -05:00
Joshua Boniface 20c8466296 Handle invalid search fields better 2020-02-04 17:35:24 -05:00
Joshua Boniface ab28bf40d1 Change ordering of services during primary switch
Fixes #77
2020-01-30 09:18:56 -05:00
Joshua Boniface 5d73974e95 Fix several bugs around load-based migrations 2020-01-29 17:35:10 -05:00
Joshua Boniface 0b31bab797 Add more helpful config parse error message 2020-01-22 12:09:31 -05:00
Joshua Boniface 4c1b78d7a4 Use dictionary get() to prevent crashes
Use the get() function throughout to prevent crashes in various
scenarios if the profile data isn't present or consistent.
2020-01-13 09:21:57 -05:00
Joshua Boniface 4ad29f669d Update default configuration samples 2020-01-12 21:33:15 -05:00
Joshua Boniface 0d2e22a111 Normalize all static networks with bridges
Modifies the storage and upstream networks to mirror the cluster
network, with a bridge on top of the underlying specified dev, and all
IPs bound to the bridge.

Allows creating VMs in the storage or upstream networks, as well as the
cluster network, should the administrator choose to do so (manually).
2020-01-12 19:04:31 -05:00
Joshua Boniface 1671a87dd4 Fix the flush service 2020-01-11 17:04:12 -05:00
Joshua Boniface b6474198a4 Implement cluster maintenance mode
Implements a "maintenance mode" for PVC clusters. For now, the only
thing this mode does is disable node fencing while the state is true.
This allows the administrator to tell PVC that network connectivity,
etc. might be interrupted and to avoid fencing nodes.

Closes #70
2020-01-09 10:53:27 -05:00
Joshua Boniface 4e5bce4975 Update copyright header year to 2020 2020-01-08 19:38:02 -05:00
Joshua Boniface c515d63340 Add provision state for VMs 2020-01-08 17:40:02 -05:00
Joshua Boniface 21d87f5e51 Add v6 configurations to dnsmasq
These options were only applied with v4 networks; now, use the v6
address in a dual-stack or v6-only network.
2020-01-06 23:48:04 -05:00
Joshua Boniface f326fd99e2 Properly fix IPv4 no-DHCP networking 2020-01-06 22:31:37 -05:00
Joshua Boniface 38dae8b32f Change name of cluster in patronictl command 2020-01-06 16:37:17 -05:00
Joshua Boniface 2d2bdb879e Use get() instead of direct dict reference 2020-01-06 16:34:39 -05:00
Joshua Boniface 30d4470c8f Only print AXFR errors in debug mode 2020-01-06 16:04:37 -05:00
Joshua Boniface bbfadac5e1 Fix dnsmasq options for DHCP-disabled networks 2020-01-06 16:04:26 -05:00
Joshua Boniface 7b3e267f7a Implement bridge_device for bridged VNIs
Required due to #64. Bridged networks were being created on top of a
vLAN if the Cluster network was a vLAN device, rather than being created
on the underlying device. This came from a previous revision of the
cluster architecture guidelines where Cluster was supposed to be a raw
device rather than a vLAN. This fixed the problem by implementing a
configuration field for a "bridge_device", a NIC device that can then
have the bridged vLANs created on top of it.

Fixes #64
2020-01-06 14:44:56 -05:00
Joshua Boniface 094ac8c3a8 Ensure stdout is used 2020-01-06 12:34:35 -05:00
Joshua Boniface 13548b791d Add additional debugging and fix pool_idx loop var 2020-01-06 11:31:22 -05:00
Joshua Boniface e7bc4f7328 Handle empty None-type hostname 2020-01-05 22:46:56 -05:00
Joshua Boniface be20ba02a7 Handle VM states in flush more accurately
We don't want to block forever on a failure, so limit valid waiting
states to just those we know it should be in during a migration.
2020-01-05 15:21:16 -05:00
Joshua Boniface 7311fa561b Fix bad join with new table name 2020-01-04 15:17:27 -05:00
Joshua Boniface bf89050e8b Update userdata table name 2020-01-04 15:10:37 -05:00
Joshua Boniface 20ae2186f9 Run VM state actions in a thread
Prevents blocking the main thread(s) while a VM is changing state. In
particular, this caused some issues with nodes not responding to
cancellation/reversal of a flush/ready state until the previous
migration was finished, which could cause issues. This entire subset of
actions is now threaded and so can run on its own in the background.
2019-12-26 11:08:16 -05:00
Joshua Boniface b3483fa810 Add explicit returns from flush/ready threads 2019-12-26 11:08:00 -05:00
Joshua Boniface 47cf0a8006 Ensure migration out occurs 2019-12-25 21:11:02 -05:00
Joshua Boniface 77db36a891 Ensure migration out occurs 2019-12-25 21:02:46 -05:00