parallelvirtualcluster/pvc

Author	SHA1	Message	Date
Joshua M. Boniface	8a28738bff	Use consistent terminology in fence message	2019-07-10 11:54:56 -04:00
Joshua M. Boniface	8f160abf90	Handle cancelling flushes when new ones run Store the flush_thread of a node as a class object. Before starting a new flush thread (either flush or unflush), stop the existing one if it exists to prevent further migrations, then start the new thread. Set the object to None on init and again once the task actually finishes. Remove the inflush flag as this is not required when using these threads and functionally does nothing any longer, but add the flush_stopper flag to trigger cancellation of the current job.	2019-07-10 11:54:34 -04:00
Joshua M. Boniface	c7c8c8bcbb	Fix bug with flush	2019-07-10 00:43:55 -04:00
Joshua M. Boniface	7a8aee9fe7	Remove flush locking functionality This just seemed like more trouble than it was worth. Flush locks were originally intended as a way to counteract the weird issues around flushing that were mostly fixed by the code refactoring, so this will help test if those issues are truly gone. If not, will look into a cleaner solution that doesn't result in unchangeable states.	2019-07-09 23:59:17 -04:00
Joshua M. Boniface	ad284b13bc	Fix bugs with fencing	2019-07-09 19:17:53 -04:00
Joshua M. Boniface	7df200ac44	Improve ZK connection loss handling	2019-07-09 19:17:32 -04:00
Joshua M. Boniface	47f86475f8	Handle failures of Ceph commands gracefully If these commands fail, catch the error, print a message, and set up empty lists. Also handle later data parsing in this case.	2019-07-09 16:43:38 -04:00
Joshua M. Boniface	1a8e7509f7	Support run_os_command timeout; use timeouts	2019-07-09 15:09:13 -04:00
Joshua M. Boniface	83a4140703	Allow enabling debug mode in config Makes debugging easier without modifying code.	2019-07-09 14:59:00 -04:00
Joshua M. Boniface	8eeba9bc9b	Make Ceph commands time out if needed	2019-07-09 14:35:53 -04:00
Joshua M. Boniface	19701c66e4	Move fencing to after keepalive output Just makes the messages a little easier to read when triggered.	2019-07-09 14:24:31 -04:00
Joshua M. Boniface	17dfaf43c5	Move hypervisor selection out to common	2019-07-09 14:20:58 -04:00
Joshua M. Boniface	b551b54642	Rename message when contending	2019-07-09 14:03:48 -04:00
Joshua M. Boniface	4249d5d982	Always load and store IPMI on daemon start Without this, the IPMI information set during initial node creation can never be changed, which can cause issues later. Instead, always set it fresh on each node boot.	2019-07-09 14:00:31 -04:00
Joshua M. Boniface	7f828a27a5	Free RBD locks when fencing node	2019-07-09 10:59:31 -04:00
Joshua M. Boniface	bc54ea2449	Log message when starting or stopping API client	2019-07-08 19:29:49 -04:00
Joshua M. Boniface	cda690e94f	Set RADOS df information in ZK	2019-07-08 10:19:56 -04:00
Joshua M. Boniface	d9ebd04264	Fix missing dom_uuid values in data reads	2019-07-07 15:30:28 -04:00
Joshua M. Boniface	b82ccaa84d	Improve flush handling Similar to recent client changes, don't replace the previous node record of an already-migrated VM. Wait for shutdown if required. Use a continue statement instead of a needless else block.	2019-07-07 15:27:37 -04:00
Joshua M. Boniface	0d398f663b	Rename "Domain" to "VM" in various class names The name "Domain", though technically correct from a Libvirt perspective, was unnecessarily confusing. Call the class instances what they are, VMs.	2019-07-07 15:20:37 -04:00
Joshua M. Boniface	8216125b02	Enable autostart of API client on Primary Adds a config flag that turns on the API client following the Primary coordinator. The retcode of the start/stop commands is ignored so this can fail gracefully if e.g. the client isn't installed.	2019-07-06 02:42:56 -04:00
Joshua M. Boniface	3e591bd09e	Remove extra whitespaces on blank lines	2019-06-25 22:33:23 -04:00
Joshua M. Boniface	08cb16bfbc	Revamp VM migration handling This was very old code that was hard to follow and quite fragile, with failures and infinite loops occurring fairly frequently. These changes make the code more robust, including the addition of timeouts, some code cleanup, and some improvements to the logical flow. Also forces the libvirt migration to occur on the cluster network, which couples to changes in the libvirtd listen (via pvc-ansible) and in Daemon.py via the previous commit.	2019-06-25 22:23:48 -04:00
Joshua M. Boniface	d336fce253	Connect to actual IP not localhost for Libvirt	2019-06-25 22:09:32 -04:00
Joshua M. Boniface	75d0e7f989	Revert "Only perform fencing duties on primary" This reverts commit `464c69aac6`. Actually, yea, this made sense - if the primary fails, it can't fence itself.	2019-06-25 12:36:48 -04:00
Joshua M. Boniface	85a5a8a0c9	Disable tx offloading on bridge interfaces Reference: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=717215#68 Without this, DHCP fails when traversing only the local bridge, for Debian Jessie or earlier (and possibly other OSes as well), due to the missing UDP checksums. This disables the offload and hence reenables the checksums even on the software-only bridge. Also rearranged the steps and added comments around this section to better clarify what each command is doing.	2019-06-25 12:36:37 -04:00
Joshua M. Boniface	464c69aac6	Only perform fencing duties on primary There was really no need for this to be shared among all the coordinators, which seemed more fragile. This way only the primary will try to fence dead nodes.	2019-06-24 20:17:51 -04:00
Joshua Boniface	249611b161	Remove duplicate import	2019-06-24 20:14:43 -04:00
Joshua M. Boniface	ef272b0b7d	Add removal confirmations and zap disk before add	2019-06-21 15:52:28 -04:00
Joshua M. Boniface	867ad1fc1b	Support human-readable biconversion and in volumes	2019-06-21 09:23:52 -04:00
Joshua M. Boniface	ddedb1a992	Set image features to supported values	2019-06-19 15:19:36 -04:00
Joshua M. Boniface	0f15e7cda5	Set shutdown state after final keepalive	2019-06-19 14:52:47 -04:00
Joshua M. Boniface	0060c0313b	Put daemonstate to shutdown when stopping This way it isn't "run" all the way until it shuts down.	2019-06-19 14:23:07 -04:00
Joshua M. Boniface	9a0554fdbe	Remove all volumes from pool on removal Technically not needed, but otherwise random errors may be thrown, so best to be explicit.	2019-06-19 12:49:03 -04:00
Joshua M. Boniface	87907d4ce8	Remove size field from volume objects This data is just in the stats anyways.	2019-06-19 10:45:14 -04:00
Joshua M. Boniface	09562fdc06	Output in json format instead	2019-06-19 10:32:01 -04:00
Joshua M. Boniface	a940d03959	Fix some bugs and add RBD volume stats	2019-06-19 10:25:22 -04:00
Joshua M. Boniface	db0b382b3d	Don't bother with snapshot management by Daemon This is definitely not needed in the end, and just uses RAM for no conceivable purpose. Snapshots are fully client-managed.	2019-06-19 09:43:04 -04:00
Joshua M. Boniface	1c9f606480	Implement volume and snapshot handling by daemon This seems like a super-gross way to do this, but at the moment I don't have a better way. Maybe just remove this component since none of the volume/snapshot stuff is dynamic; will see as this progresses.	2019-06-19 09:40:32 -04:00
Joshua M. Boniface	784b428ed0	Add creation of volume and snapshot lists	2019-06-19 09:29:36 -04:00
Joshua M. Boniface	064e6455bc	Correct some more bugs	2019-06-19 00:29:21 -04:00
Joshua M. Boniface	a4ab3075ab	Correct some bugs around new code	2019-06-19 00:23:25 -04:00
Joshua M. Boniface	01959cb9e3	Implementation of RBD volumes and snapshots Adds the ability to manage RBD volumes (add/remove) and RBD snapshots (add/remove). (Working) list functions to come.	2019-06-19 00:12:44 -04:00
Joshua M. Boniface	2bbbda3da5	Only trigger pool updates on primary	2019-06-18 21:26:05 -04:00
Joshua M. Boniface	612f5ab52c	Strip pv_block from stdout	2019-06-18 20:34:25 -04:00
Joshua M. Boniface	1622226c32	Add more logging during OSD creation/deletion	2019-06-18 20:31:04 -04:00
Joshua M. Boniface	3adeef6fdd	Use the fsid to activate new OSDs	2019-06-18 20:22:28 -04:00
Joshua M. Boniface	443108f53d	Add support for enable/disable keepalive detail	2019-06-18 19:54:42 -04:00
Joshua M. Boniface	79f284a0a9	Pass logger into run_command	2019-06-18 13:45:59 -04:00
Joshua M. Boniface	080ca3201c	Correct actual problem with this_node	2019-06-18 13:43:54 -04:00


261 Commits
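
The flush cancellation pattern described in commit 8f160abf90 (store the running thread on the node object, stop it before starting a new one, clear it when the task finishes) might look roughly like the sketch below. This is not the PVC code: the `Node` class, the `start_flush` and `wait` methods, the `migrated` list, and the sleep standing in for a migration are all hypothetical and chosen only for illustration. Only the attribute names `flush_thread` and `flush_stopper` are taken from the commit message.

```python
import threading
import time


class Node:
    """Hypothetical stand-in for a node object; not the actual PVC class."""

    def __init__(self):
        self.flush_thread = None    # set to None on init, as the commit describes
        self.flush_stopper = False  # flag that asks the running task to stop
        self.migrated = []          # illustrative record of "migrated" VMs

    def _flush(self, vms):
        for vm in vms:
            if self.flush_stopper:
                break               # cancelled: perform no further migrations
            time.sleep(0.01)        # placeholder for a real migration
            self.migrated.append(vm)
        self.flush_thread = None    # cleared again once the task finishes

    def start_flush(self, vms):
        # Stop the existing task, if any, before starting the new one.
        old = self.flush_thread
        if old is not None:
            self.flush_stopper = True
            old.join()
        self.flush_stopper = False
        self.flush_thread = threading.Thread(target=self._flush, args=(vms,))
        self.flush_thread.start()

    def wait(self):
        # Block until the current task (if any) has finished.
        current = self.flush_thread
        if current is not None:
            current.join()
```

Calling `start_flush` a second time while the first task is still running stops the first after its current step and then runs the second to completion. A plain boolean is used here because the commit message speaks of a flag; a `threading.Event` would be a reasonable alternative.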