parallelvirtualcluster/pvc

Author	SHA1	Message	Date
Joshua M. Boniface	0587bcbd67	Go back to manual command for OSD stats. Using the Ceph library was a disaster here; it had no timeout or way to force it to continue, so keepalives would become stuck and trigger fence storms. Go back to the manual osd dump command with a 2s timeout which is far more reliable and can be adequately terminated if it runs long.	2020-08-12 22:31:25 -04:00
Joshua M. Boniface	42f2dedf6d	Add syntax checking of userdata YAML	2020-08-12 14:09:56 -04:00
Joshua M. Boniface	0d470ae5f6	Work around formatting fail	2020-08-12 12:12:16 -04:00
Joshua M. Boniface	5b5b7d2276	Improve the conditional so it will always work	2020-08-11 23:08:40 -04:00
Joshua M. Boniface	0468eeb531	Support live resizing of running disk volumes. This wasn't happening automatically, nor does it happen with qemu-img commands, so we have to manually trigger a libvirt blockResize against the volume. This setup is a little roundabout but seems to work fine.	2020-08-11 21:46:12 -04:00
Joshua M. Boniface	0dd719a682	Use single-quotes so Python isn't confused	2020-08-11 17:24:11 -04:00
Joshua M. Boniface	09c1bb6a46	Increase start delay of flush service	2020-08-11 14:17:35 -04:00
Joshua M. Boniface	e0cb4a58c3	Ensure zk_listener is readded after reconnect	2020-08-11 12:46:15 -04:00
Joshua M. Boniface	099c58ead8	Fix missing char in log message	2020-08-11 12:40:35 -04:00
Joshua M. Boniface	37b23c0e59	Add comments to build-and-deploy.sh	2020-08-11 12:10:28 -04:00
Joshua M. Boniface	0e5c681ada	Clean up imports. Make several imports more specific to reduce redundant code imports and improve memory utilization.	2020-08-11 12:09:10 -04:00
Joshua M. Boniface	46ffe352e3	Better handle subthread timeouts in keepalive. Prevent the main keepalive thread from getting stuck due to a subthread taking an enormous time. If this happens, the rest of the main keepalive will continue onward, thus ensuring that the main keepalive does not fail for a significant number of cycles, which would cause a fence.	2020-08-11 11:37:26 -04:00
Joshua M. Boniface	5526e13da9	Move all host provisioner steps to a try block. Make the provisioner a bit more robust. This way, even if a provisioning step fails, cleanup is still performed, thus preventing the system from being left in an undefined state requiring manual correction. Addresses #91	2020-08-06 12:27:10 -04:00
Joshua M. Boniface	ccee124c8b	Adjust fence failcount limit to 6 (30s). The previous saving throw limit (3/15s) seems to have been too low. I was observing bizarre failures where a node would be fenced while it was still starting up. Some of this may have been related to Zookeeper connections taking too long, but this was inconsistent. Increase this to 6 saving throws (30s). This provides significantly more time for a node to properly check in on startup before another node fences it. In the real world, 15s vs 30s isn't that big of a downtime change, but prevents false-positive fences.	2020-08-05 22:40:07 -04:00
Joshua M. Boniface	02343079c0	Improve fencing migrate layout. Open the option to do this in parallel with some threads.	2020-08-05 22:26:01 -04:00
Joshua M. Boniface	37b83aad6a	Add logging and use better conditional	2020-08-05 21:57:36 -04:00
Joshua M. Boniface	876f2424e0	Ensure dead state isn't written erroneously	2020-08-05 21:57:11 -04:00
Joshua M. Boniface	4438dd401f	Add description to example in network add. This is a required field, so ensure it is in the example.	2020-08-05 10:35:41 -04:00
Joshua M. Boniface	142743b2c0	Fix erroneous comma	2020-08-05 10:34:30 -04:00
Joshua M. Boniface	bafdcf9f8c	Use new_size to match new_name	2020-08-05 10:25:37 -04:00
Joshua M. Boniface	6fe74b34b2	Use .get for JSON message responses	2020-07-20 12:31:12 -04:00
Joshua M. Boniface	9f86f12f1a	Only parse script_run_args if not None	2020-07-16 02:36:26 -04:00
Joshua M. Boniface	ad45f6097f	Don't output anything if no results and --raw	2020-07-16 02:35:02 -04:00
Joshua M. Boniface	be405caa11	Remove spurious print statement	2020-07-08 13:28:47 -04:00
Joshua M. Boniface	a1ba9d2eeb	Allow specifying arbitrary script_args on CLI. Allow the specifying of arbitrary provisioner script install() args on the provisioner create CLI, either overriding or adding additional per-VM arguments to those found in the profile. Reference example is setting a "vm_fqdn" on a per-run basis. Closes #100	2020-07-08 13:18:12 -04:00
Joshua M. Boniface	8fc5299d38	Avoid failing if CPU features are missing	2020-07-08 12:32:42 -04:00
Joshua M. Boniface	37a58d35e8	Implement limiting of node output. Closes #98	2020-06-25 11:51:53 -04:00
Joshua M. Boniface	d74f68c904	Add quiet option to CLI. Closes #99	2020-06-25 11:09:55 -04:00
Joshua M. Boniface	15e986c158	Support storing client config in override dir	2020-06-25 11:07:01 -04:00
Joshua M. Boniface	5871380e1b	Avoid crashing VM stats thread if domain migrated	2020-06-10 17:10:46 -04:00
Joshua M. Boniface	2967c97f1a	Format and display extra VM statistics	2020-06-07 03:04:36 -04:00
Joshua M. Boniface	4cdf1f7247	Add statistics values to the API	2020-06-07 02:15:33 -04:00
Joshua M. Boniface	deaf138e45	Add stats to VM information	2020-06-07 00:42:11 -04:00
Joshua M. Boniface	654a3cb7fa	Improve debug output and use ceph df util data	2020-06-06 22:52:49 -04:00
Joshua M. Boniface	9b65d3271a	Improve handling of Ceph status gathering. Use the Rados library instead of random OS commands, which massively improves the performance of these tasks. Closes #97	2020-06-06 22:30:25 -04:00
Joshua M. Boniface	fba39cb739	Fix broken sorting for pools and volumes	2020-06-06 21:28:54 -04:00
Joshua M. Boniface	598b2025e8	Use Rados and add Ceph entries to pvcnoded.yaml	2020-06-06 21:12:51 -04:00
Joshua M. Boniface	70b787d1fd	Move all VM functions into thread	2020-06-06 15:44:05 -04:00
Joshua M. Boniface	e1310a05f2	Implement recording of VM stats during keepalive	2020-06-06 15:34:03 -04:00
Joshua M. Boniface	2ad6860dfe	Move Ceph statistics gathering into thread	2020-06-06 13:25:02 -04:00
Joshua M. Boniface	cebb4bbc1a	Comment cleanup	2020-06-06 13:20:40 -04:00
Joshua M. Boniface	a672e06dd2	Move fencing to end of keepalive function	2020-06-06 13:19:11 -04:00
Joshua M. Boniface	1db73bb892	Move libvirt closure into previous section	2020-06-06 13:18:37 -04:00
Joshua M. Boniface	c1956072f0	Rename update_zookeeper function to node_keepalive	2020-06-06 12:49:50 -04:00
Joshua M. Boniface	ce60836c34	Allow enforcement of live migration. Provides a CLI and API argument to force live migration, which triggers a new VM state "migrate-live". The node daemon VMInstance during migrate will read this flag from the state and, if enforced, will not trigger a shutdown migration. Closes #95	2020-06-06 12:00:44 -04:00
Joshua M. Boniface	b5434ba744	Fix typo in variable name	2020-06-06 11:29:48 -04:00
Joshua M. Boniface	f61d443773	Allow move of migrated VM to current node. Will make the migrate permanent instead of throwing an error. Fixes #96	2020-06-06 11:25:10 -04:00
Joshua M. Boniface	da20b4493a	Properly return the function	2020-06-05 15:50:43 -04:00
Joshua M. Boniface	440821b136	Refactor cluster validation into a command wrapper. Instead of using group-based validation, which breaks the help context for subcommands, use a decorator to validate the cluster status for each command. The eager help option will then override this decorator for help commands, while enforcing it for others.	2020-06-05 14:49:53 -04:00
Joshua M. Boniface	b9e5b14f94	Update lastnode too if a self-migrate is aborted. References #92	2020-06-04 10:28:04 -04:00
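
Commit 46ffe352e3 above describes letting the main keepalive continue when a subthread runs long. The following is only a minimal sketch of that general pattern, not the project's code: the function name `wait_for_subthreads` and the task names are hypothetical, and the real keepalive may be structured quite differently.

```python
import threading
import time


def wait_for_subthreads(tasks, timeout):
    """Start one thread per task and wait at most `timeout` seconds in total.

    `tasks` maps a (hypothetical) task name to a zero-argument callable.
    Returns the names of tasks that were still running at the deadline,
    so the caller can log them and carry on instead of blocking.
    """
    threads = []
    for name, func in tasks.items():
        # Daemon threads do not keep the process alive if they never finish.
        thread = threading.Thread(target=func, name=name, daemon=True)
        thread.start()
        threads.append(thread)

    # One shared deadline, so the total wait is bounded by `timeout`
    # rather than by timeout * number_of_threads.
    deadline = time.monotonic() + timeout
    timed_out = []
    for thread in threads:
        remaining = max(0.0, deadline - time.monotonic())
        thread.join(timeout=remaining)
        if thread.is_alive():
            timed_out.append(thread.name)
    return timed_out
```

A caller would treat a non-empty return value as "skip these results for this cycle" and proceed with the rest of the keepalive. Note that `join` with a timeout only stops waiting; it does not stop the slow thread itself.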

2181 Commits
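
Commit 0587bcbd67 above mentions running the manual osd dump command with a 2s timeout so that it can be terminated if it runs long. As a rough illustration of that idea, and assuming nothing about the actual implementation, an external command can be bounded with the standard library's `subprocess.run`. The helper name `run_with_timeout` is hypothetical.

```python
import subprocess
import sys


def run_with_timeout(command, timeout=2.0):
    """Run `command` (a list of arguments) with a hard time limit.

    Returns (returncode, stdout) on completion, or None if the command
    exceeded `timeout` seconds.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
            text=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before
        # re-raising, so no stuck process is left behind.
        return None
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; on a Ceph node this
    # might instead be something like ["ceph", "osd", "dump", "--format", "json"]
    # (the exact invocation used by the project is an assumption here).
    print(run_with_timeout([sys.executable, "-c", "print('ok')"]))
```

Returning `None` on timeout lets the caller skip the statistics for that cycle instead of blocking the keepalive.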