parallelvirtualcluster/pvc

Author	SHA1	Message	Date
Joshua M. Boniface	a66b834ae4	Fix several small bugs	2019-12-19 18:58:53 -05:00
Joshua M. Boniface	b17b7bf22b	Add black magic to minimize ping losses. This particular arping interval/count, along with forcing it to run in the foreground, seems to minimize the packet loss when the primary coordinator transitions. In extensive testing, this value consistently results in the least loss: 1-2 pings, at a 0.025s ping interval, return "TTL exceeded", with no other loss, and only when the node the test VM is on is the one switching to secondary state. No other combination of values here, nor tweaks to other parts of the code, seems able to reduce this further, so this is likely the best configuration possible.	2019-12-19 18:57:32 -05:00
Joshua M. Boniface	8c252aeecc	Implemented coordinated locked node transitions. The previous method was a "throw it in the sea"-type migration with some (very arbitrary) sleep statements thrown in for good measure. Reimplement this with some hard locking. During each phase of the transition, the nodes acquire read/write shared locks on a Zookeeper key so that they can tightly coordinate the actions of transferring each part of the primary state between them. This is done in a subthread to prevent strange blocking issues that were encountered, likely due to busyness in the existing main thread.	2019-12-19 10:56:34 -05:00
Joshua M. Boniface	0841ddf8b0	Handle integrity errors in DNS aggregator	2019-12-19 10:45:06 -05:00
Joshua M. Boniface	98764f1edd	Clean up some aspects of node switchover	2019-12-18 21:39:40 -05:00
Joshua M. Boniface	23188199cb	Handle failing Patroni events more gracefully	2019-12-18 21:12:22 -05:00
Joshua M. Boniface	2b1b78622e	Fix invalid arping option. It made little difference and didn't error, but was incorrect.	2019-12-18 12:06:40 -05:00
Joshua M. Boniface	364ab10673	Add slight delay when stopping the metadata API	2019-12-18 11:56:04 -05:00
Joshua Boniface	39c9f911cc	Increase arping interval to 0.2s	2019-12-15 14:55:34 -05:00
Joshua Boniface	686af31c08	Reduce arping interval to 0.1s	2019-12-15 12:30:45 -05:00
Joshua Boniface	0a94fac407	Fix bugs around passing master. It was not passing properly and sometimes got stuck, so modify the checking and route creation a bit to prevent it. Seems to work.	2019-12-15 00:08:18 -05:00
Joshua Boniface	b3e21a5bf8	Integrate metadata API into node daemon	2019-12-14 16:41:01 -05:00
Joshua Boniface	8c36e7618a	Modify node daemon to follow API	2019-12-14 14:13:26 -05:00
Joshua Boniface	78f053d81f	Recreate network in aggregator if DNS changes	2019-12-13 00:03:47 -05:00
Joshua Boniface	0a8dd30a48	Restart dnsmasq when network details change	2019-12-12 23:51:22 -05:00
Joshua Boniface	6fa828e721	Don't stop the provisioner worker. It should probably just be running on all nodes all the time already, but is started when a node first becomes primary.	2019-12-12 23:08:02 -05:00
Joshua Boniface	c1b6ce0ff7	Reorder starting clients	2019-12-12 23:03:34 -05:00
Joshua Boniface	b854d53fab	Add API management to node daemon	2019-12-12 22:59:07 -05:00
Joshua Boniface	88a181b20d	Allow metadata API in nft rules	2019-12-11 17:04:29 -05:00
Joshua Boniface	1fb560e996	Add DNS nameservers to networks	2019-12-08 23:55:45 -05:00
Joshua Boniface	9cb5561e77	Move default NS record to upstream_domain	2019-12-08 23:05:32 -05:00
Joshua Boniface	3471f4e57a	Remove obsolete pvc-nsX and add pvc-ns name. It should point towards the floating IP.	2019-12-08 20:20:20 -05:00
Joshua Boniface	356c12db2e	Add ceph df output to pool data. Makes additional information from the `ceph df` command visible, including pool free space and used percentage.	2019-12-06 00:47:27 -05:00
Joshua Boniface	531578fd28	Use consistent tense for VM states. Replace "failed" with "fail" and "disabled" with "disable" for consistency with the remaining states.	2019-10-23 23:57:59 -04:00
Joshua M. Boniface	040ca33683	Clean up handling of OSD dump command	2019-10-22 12:51:29 -04:00
Joshua M. Boniface	190623bdd9	Use empty string for node limit	2019-10-22 12:32:14 -04:00
Joshua M. Boniface	f0e0a38a20	Fix bug in config element retrieval	2019-10-22 12:30:23 -04:00
Joshua Boniface	237a37015d	Set upstream IP in key if changed	2019-10-21 16:50:41 -04:00
Joshua Boniface	10ae260b92	Properly handle empty node limit	2019-10-17 13:34:11 -04:00
Joshua Boniface	03447d3374	Update copyright string year to include 2019	2019-10-13 12:09:51 -04:00
Joshua Boniface	116013695f	Fix bugs with bad strings	2019-10-12 18:43:29 -04:00
Joshua Boniface	18fc49fc6c	Use node instead of hypervisor consistently	2019-10-12 01:59:08 -04:00
Joshua Boniface	8dc0c8f0ac	Fix minor bugs	2019-10-12 01:36:50 -04:00
Joshua Boniface	5995353597	Implement VM metadata and use it. Implements the storing of three VM metadata attributes: 1. Node limits - allows specifying a list of hosts on which the VM must run. This limit influences the migration behaviour of VMs. 2. Per-VM node selectors - allows each VM to have its migration autoselection method specified, to automatically allow different methods per VM based on the administrator's preferences. 3. VM autorestart - allows a VM to be automatically restarted from a stopped state, presumably due to a failure to find a target node (either due to limits or otherwise) during a flush/fence recovery, on the next node unflush/ready state of its home hypervisor. Useful mostly in conjunction with limits to ensure that VMs which were shut down due to there being no valid migration targets are started back up when their node becomes ready again. Includes the full client interaction with these metadata options, including printing, as well as defining a new function to modify this metadata. For the CLI it is set/modified either on `vm define` or via the `vm meta` command. For the API it is set/modified either on a POST to the `/vm` endpoint (during VM definition) or on a POST to the `/vm/<vm>` endpoint. For the API this replaces the previous reserved word for VM creation from scratch, as this will no longer be implemented in-daemon (see #22). Closes #52	2019-10-12 01:17:39 -04:00
Joshua M. Boniface	76e6b42389	Add clone_volume backend command	2019-10-10 14:09:07 -04:00
Joshua Boniface	983daceaed	Fix shutdown abort during restart. Restart state, being different from shutdown, would trigger an abort of the shutdown. Fix this by including restart in the valid states to continue.	2019-09-07 12:08:31 -04:00
Joshua Boniface	7c4d18691a	Implement configurable replcfg (node-side). Implements administrator-selectable replication configurations for new pools in PVC clusters, overriding the default of copies=3,mincopies=2.	2019-08-23 21:58:54 -04:00
Joshua Boniface	267a3d16e5	Bump version to 0.5	2019-08-08 20:56:27 -04:00
Joshua M. Boniface	2880a761c0	Move Ceph command pipe to new location. Matching the new /cmd/domain pipe, move Ceph pipe to /cmd/ceph.	2019-08-07 14:47:27 -04:00
Joshua M. Boniface	b7546e3711	Fix bugs in command pipeline for VMs	2019-08-07 14:13:01 -04:00
Joshua M. Boniface	0ff2d7d537	Use shlex for command splitting. This will preserve quoted strings, required for the rbd lock commands.	2019-08-07 14:02:57 -04:00
Joshua M. Boniface	a2a630f6a0	Add pipeline for VM lock flush cmd	2019-08-07 13:49:33 -04:00
Joshua M. Boniface	496216321e	Move lock flushing to VMInstance. Prepares for reuse of this function via client commands.	2019-08-07 13:36:56 -04:00
Joshua M. Boniface	0446b2db02	Catch exceptions if Patroni is not up	2019-08-07 11:46:58 -04:00
Joshua M. Boniface	7e77752ce5	Add limit to Patroni switchover attempts	2019-08-07 11:46:42 -04:00
Joshua M. Boniface	33a963c2af	Improve fence output on failure and increase delay	2019-08-07 11:35:49 -04:00
Joshua M. Boniface	e92a57606d	Use better forceful arping command. Send ARP responses with the source IP in them to force an update even if the old primary did not cleanly terminate (during fencing for instance).	2019-08-07 11:29:38 -04:00
Joshua M. Boniface	ef3b6b3723	Arping 3 times instead of 2. During a fence, 2 is not always enough for the network to recognize the change in primary coordinator.	2019-08-07 11:15:36 -04:00
Joshua M. Boniface	3b27a88128	Allow abort of shutdown state. Adds some logic to allow an active shutdown state to be aborted by changing the VM to another state. Useful mostly if a VM is doing funky things and not responding to the shutdown, but the administrator either doesn't want to wait for the timer to expire (forcing an immediate termination) or wishes to abort the shutdown attempt. Fixes #49	2019-08-07 10:58:18 -04:00
Joshua M. Boniface	e2ae58b62c	Add the missing newline to the string compare	2019-08-04 17:00:33 -04:00


344 Commits
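
Commit 0ff2d7d537 above switches command splitting to shlex so that quoted strings survive, which the message says the rbd lock commands require. The sketch below illustrates the difference using only the standard library. The command string, image name, lock ID, and locker are illustrative assumptions and are not taken from the PVC code.

```python
import shlex

# Hypothetical command string (not from the PVC source): the lock ID
# "auto 12345" contains a space and is therefore quoted.
command = 'rbd lock remove vms/disk0 "auto 12345" client.4567'

# Plain whitespace splitting breaks the quoted lock ID into two tokens
# and leaves the quote characters embedded in them.
naive_tokens = command.split()

# shlex.split applies shell-like quoting rules, so the quoted lock ID
# stays together as a single argument with the quotes removed.
quoted_tokens = shlex.split(command)

print(naive_tokens)
print(quoted_tokens)
```

A token list produced this way can be passed to something like `subprocess.run` as an argument vector without involving a shell, which is one common reason to prefer it over manual splitting.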