parallelvirtualcluster/pvc

Author	SHA1	Message	Date
Joshua M. Boniface	7b3e267f7a	Implement bridge_device for bridged VNIs Required due to #64. Bridged networks were being created on top of a vLAN if the Cluster network was a vLAN device, rather than being created on the underlying device. This came from a previous revision of the cluster architecture guidelines where Cluster was supposed to be a raw device rather than a vLAN. This fixed the problem by implementing a configuration field for a "bridge_device", a NIC device that can then have the bridged vLANs created on top of it. Fixes #64	2020-01-06 14:44:56 -05:00
Joshua M. Boniface	094ac8c3a8	Ensure stdout is used	2020-01-06 12:34:35 -05:00
Joshua M. Boniface	13548b791d	Add additional debugging and fix pool_idx loop var	2020-01-06 11:31:22 -05:00
Joshua M. Boniface	e7bc4f7328	Handle empty None-type hostname	2020-01-05 22:46:56 -05:00
Joshua M. Boniface	be20ba02a7	Handle VM states in flush more accurately We don't want to block forever on a failure, so limit valid waiting states to just those we know it should be in during a migration.	2020-01-05 15:21:16 -05:00
Joshua M. Boniface	7311fa561b	Fix bad join with new table name	2020-01-04 15:17:27 -05:00
Joshua M. Boniface	bf89050e8b	Update userdata table name	2020-01-04 15:10:37 -05:00
Joshua M. Boniface	20ae2186f9	Run VM state actions in a thread Prevents blocking the main thread(s) while a VM is changing state. In particular, this caused some issues with nodes not responding to cancellation/reversal of a flush/ready state until the previous migration was finished, which could cause issues. This entire subset of actions is now threaded and so can run on its own in the background.	2019-12-26 11:08:16 -05:00
Joshua M. Boniface	b3483fa810	Add explicit returns from flush/ready threads	2019-12-26 11:08:00 -05:00
Joshua M. Boniface	47cf0a8006	Ensure migration out occurs	2019-12-25 21:11:02 -05:00
Joshua M. Boniface	77db36a891	Ensure migration out occurs	2019-12-25 21:02:46 -05:00
Joshua M. Boniface	9a39d739e8	Ensure we empty flush_thread	2019-12-25 20:29:17 -05:00
Joshua M. Boniface	a66b834ae4	Fix several small bugs	2019-12-19 18:58:53 -05:00
Joshua M. Boniface	b17b7bf22b	Add black magic to minimize ping losses This particular arping interval/count, along with forcing it to run in the foreground, seems to minimize the packet loss when the primary coordinator transitions. Through extensive testing, this value consistently results in the least loss: 1-2 pings, at a 0.025s ping interval, return "TTL exceeded", with no other loss, and only when the node the test VM is on is the one switching to secondary state. No other combination of values here, nor tweaks to other parts of the code, seems able to reduce this further, therefore this is likely the best configuration possible.	2019-12-19 18:57:32 -05:00
Joshua M. Boniface	8c252aeecc	Implemented coordinated locked node transitions The previous method was a "throw it in the sea"-type migration with some (very arbitrary) sleep statements thrown in for good measure. Reimplement this with some hard locking. During each phase of the transition, the nodes acquire read/write shared locks to a Zookeeper key so that they can tightly coordinate the actions of transferring each part of the primary state between them. This is done in a subthread to prevent strange blocking issues that were encountered, likely due to busyness in the existing main thread.	2019-12-19 10:56:34 -05:00
Joshua M. Boniface	0841ddf8b0	Handle integrity errors in DNS aggregator	2019-12-19 10:45:06 -05:00
Joshua M. Boniface	98764f1edd	Clean up some aspects of node switchover	2019-12-18 21:39:40 -05:00
Joshua M. Boniface	23188199cb	Handle failing Patroni events more gracefully	2019-12-18 21:12:22 -05:00
Joshua M. Boniface	2b1b78622e	Fix invalid arping option It made little difference and didn't error, but was incorrect.	2019-12-18 12:06:40 -05:00
Joshua M. Boniface	364ab10673	Add slight delay when stopping the metadata API	2019-12-18 11:56:04 -05:00
Joshua Boniface	39c9f911cc	Increase arping interval to 0.2s	2019-12-15 14:55:34 -05:00
Joshua Boniface	686af31c08	Reduce arping interval to 0.1s	2019-12-15 12:30:45 -05:00
Joshua Boniface	0a94fac407	Fix bugs around passing master Was not passing properly and getting stuck sometimes, so modify the checking and route creation a bit to prevent it. Seems to work.	2019-12-15 00:08:18 -05:00
Joshua Boniface	b3e21a5bf8	Integrate metadata API into node daemon	2019-12-14 16:41:01 -05:00
Joshua Boniface	8c36e7618a	Modify node daemon to follow API	2019-12-14 14:13:26 -05:00
Joshua Boniface	78f053d81f	Recreate network in aggregator if DNS changes	2019-12-13 00:03:47 -05:00
Joshua Boniface	0a8dd30a48	Restart dnsmasq when network details change	2019-12-12 23:51:22 -05:00
Joshua Boniface	6fa828e721	Don't stop the provisioner worker It should probably just be running on all nodes all the time already, but is started when a node first becomes primary.	2019-12-12 23:08:02 -05:00
Joshua Boniface	c1b6ce0ff7	Reorder starting clients	2019-12-12 23:03:34 -05:00
Joshua Boniface	b854d53fab	Add API management to node daemon	2019-12-12 22:59:07 -05:00
Joshua Boniface	88a181b20d	Allow metadata API in nft rules	2019-12-11 17:04:29 -05:00
Joshua Boniface	1fb560e996	Add DNS nameservers to networks	2019-12-08 23:55:45 -05:00
Joshua Boniface	9cb5561e77	Move default NS record to upstream_domain	2019-12-08 23:05:32 -05:00
Joshua Boniface	3471f4e57a	Remove obsolete pvc-nsX and add pvc-ns name Should point towards the floating IP.	2019-12-08 20:20:20 -05:00
Joshua Boniface	356c12db2e	Add ceph df output to pool data Allows additional information visible in the `ceph df` command, including pool free space and used percentage.	2019-12-06 00:47:27 -05:00
Joshua Boniface	531578fd28	Use consistent tense for VM states Replace "failed" with "fail" and "disabled" with "disable" for consistency with the remaining states.	2019-10-23 23:57:59 -04:00
Joshua M. Boniface	040ca33683	Clean up handling of OSD dump command	2019-10-22 12:51:29 -04:00
Joshua M. Boniface	190623bdd9	Use empty string for node limit	2019-10-22 12:32:14 -04:00
Joshua M. Boniface	f0e0a38a20	Fix bug in config element retrieval	2019-10-22 12:30:23 -04:00
Joshua Boniface	237a37015d	Set upstream IP in key if changed	2019-10-21 16:50:41 -04:00
Joshua Boniface	10ae260b92	Properly handle empty node limit	2019-10-17 13:34:11 -04:00
Joshua Boniface	03447d3374	Update copyright string year to include 2019	2019-10-13 12:09:51 -04:00
Joshua Boniface	116013695f	Fix bugs with bad strings	2019-10-12 18:43:29 -04:00
Joshua Boniface	18fc49fc6c	Use node instead of hypervisor consistently	2019-10-12 01:59:08 -04:00
Joshua Boniface	8dc0c8f0ac	Fix minor bugs	2019-10-12 01:36:50 -04:00
Joshua Boniface	5995353597	Implement VM metadata and use it Implements the storing of three VM metadata attributes: 1. Node limits - allows specifying a list of hosts on which the VM must run. This limit influences the migration behaviour of VMs. 2. Per-VM node selectors - allows each VM to have its migration autoselection method specified, to automatically allow different methods per VM based on the administrator's preferences. 3. VM autorestart - allows a VM to be automatically restarted from a stopped state, presumably due to a failure to find a target node (either due to limits or otherwise) during a flush/fence recovery, on the next node unflush/ready state of its home hypervisor. Useful mostly in conjunction with limits to ensure that VMs which were shut down due to there being no valid migration targets are started back up when their node becomes ready again. Includes the full client interaction with these metadata options, including printing, as well as defining a new function to modify this metadata. For the CLI it is set/modified either on `vm define` or via the `vm meta` command. For the API it is set/modified either on a POST to the `/vm` endpoint (during VM definition) or on POST to the `/vm/<vm>` endpoint. For the API this replaces the previous reserved word for VM creation from scratch as this will no longer be implemented in-daemon (see #22). Closes #52	2019-10-12 01:17:39 -04:00
Joshua M. Boniface	76e6b42389	Add clone_volume backend command	2019-10-10 14:09:07 -04:00
Joshua Boniface	983daceaed	Fix shutdown abort during restart Restart state, being different from shutdown, would trigger an abort of the shutdown. Fix this by including restart in the valid states to continue.	2019-09-07 12:08:31 -04:00
Joshua Boniface	7c4d18691a	Implement configurable replcfg (node-side) Implements administrator-selectable replication configurations for new pools in PVC clusters, overriding the default of copies=3,mincopies=2.	2019-08-23 21:58:54 -04:00
Joshua Boniface	267a3d16e5	Bump version to 0.5	2019-08-08 20:56:27 -04:00
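
The VM metadata described in commit 5995353597 (node limits, per-VM node selectors, autorestart) can be illustrated with a minimal sketch. This is not the PVC implementation: the `VMMeta` class, the `select_target` function, the selector names `"mem"` and `"vms"`, and the shape of the node dictionary are all hypothetical, chosen only to show how a node limit and a per-VM selector might combine when picking a migration target.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VMMeta:
    """Hypothetical container for the three metadata attributes."""
    node_limit: List[str] = field(default_factory=list)  # empty list = no limit
    node_selector: str = "mem"                           # assumed selector name
    autorestart: bool = False


def select_target(meta: VMMeta, current_node: str,
                  nodes: Dict[str, dict]) -> Optional[str]:
    """Pick a migration target, or None if no valid target exists.

    `nodes` maps node name -> {"state", "mem_free", "vms"} (assumed layout).
    """
    # Keep only ready nodes other than the current one that satisfy the limit.
    candidates = {
        name: info for name, info in nodes.items()
        if name != current_node
        and info["state"] == "ready"
        and (not meta.node_limit or name in meta.node_limit)
    }
    if not candidates:
        # No valid target: the VM would be stopped, and autorestart (if set)
        # would bring it back once its own node is ready again.
        return None
    if meta.node_selector == "mem":
        # Prefer the node with the most free memory.
        return max(candidates, key=lambda n: candidates[n]["mem_free"])
    if meta.node_selector == "vms":
        # Prefer the node running the fewest VMs.
        return min(candidates, key=lambda n: candidates[n]["vms"])
    raise ValueError(f"unknown selector: {meta.node_selector}")
```

With a limit that excludes every ready node, `select_target` returns `None`, which is the case the autorestart attribute is meant to cover.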


243 Commits