parallelvirtualcluster/pvc

Author	SHA1	Message	Date
Joshua M. Boniface	a088aa4484	Add node log functions to API and CLI	2021-07-18 18:54:28 -04:00
Joshua M. Boniface	323c7c41ae	Implement node logging into Zookeeper Adds the ability to send node daemon logs to Zookeeper to facilitate a command like "pvc node log", similar to "pvc vm log". Each node stores its logs in a separate tree under "/logs" which can then be combined or queried. By default, set by config, only 2000 lines are kept.	2021-07-18 17:11:43 -04:00
Joshua M. Boniface	75fb60b1b4	Add VM list filtering by tag Uses same method as state or node filtering, rather than altering how the main LIMIT field works.	2021-07-14 00:59:20 -04:00
Joshua M. Boniface	9ea9ac3b8a	Revamp tag handling and display Add an additional protected class, limit manipulation to one at a time, and ensure future flexibility. Also makes display consistent with other VM elements.	2021-07-13 22:39:52 -04:00
Joshua M. Boniface	c0a3467b70	Simplify VM metadata reads Directly call the new common getDomainMetadata function to avoid excessive Zookeeper calls for this information.	2021-07-13 19:05:33 -04:00
Joshua M. Boniface	9a199992a1	Add functions for manipulating VM tags Adds tags to schema (v3), to VM definition, adds function to modify tags, adds function to get tags, and adds tags to VM data output. Tags will enable more granular classification of VMs based either on administrator configuration or from automated system events.	2021-07-13 19:05:33 -04:00
Joshua M. Boniface	c76149141f	Only log ZK connections when persistent Prevents spam in the API logs.	2021-07-10 23:35:49 -04:00
Joshua M. Boniface	0699c48d10	Fix bad schema path name	2021-07-09 16:47:09 -04:00
Joshua M. Boniface	4832245d9c	Handle non-RBD disks and non-RBD errors better	2021-07-09 15:48:57 -04:00
Joshua M. Boniface	2138f2f59f	Fail VM removal on disk removal failures Prevents bad states where the VM is "removed" but some of its disks remain due to e.g. stuck watchers. Rearrange the sequence so it goes stop, delete disks, then delete VM, and then return a failure if any of the disk(s) fail to remove, allowing the task to be rerun after fixing the problem.	2021-07-09 15:39:06 -04:00
Joshua M. Boniface	d1d355a96b	Avoid errors if stats data is None	2021-07-09 13:13:54 -04:00
Joshua M. Boniface	c0c9327a7d	Return an empty log if the value is None	2021-07-09 13:08:00 -04:00
Joshua M. Boniface	5ffabcfef5	Avoid failing if we can't get the future data	2021-07-09 13:05:37 -04:00
Joshua M. Boniface	80fe96b24d	Add some additional docstrings	2021-07-07 12:28:08 -04:00
Joshua M. Boniface	80f04ce8ee	Remove connection renewal in state handler Regenerating the ZK connection was fraught with issues, including duplicate connections, strange failures to reconnect, and various other wonkiness. Instead let Kazoo handle states sensibly. Kazoo moves to SUSPENDED state when it loses connectivity, and stays there indefinitely (based on cursory tests). And Kazoo seems to always resume from this just fine on its own. Thus all that hackery did nothing but complicate reconnection. This therefore turns the listener into a purely informational function, providing logs of when/why it failed, and we also add some additional output messages during initial connection and final disconnection.	2021-07-07 11:55:12 -04:00
Joshua M. Boniface	a8c28786dd	Better handle empty ipaths in schema When trying to write to sub-item paths that don't yet exist, the previous method would just blindly write to whatever the root key is, which is never what we actually want. Instead, check explicitly for a "base path" situation, and handle that. Then, if we try to get a subpath that isn't valid, return None. Finally in the various functions, if the path is None, just continue (or return false/None) and (try to) chug along.	2021-07-05 23:35:03 -04:00
Joshua M. Boniface	c45804e8c1	Revert "Return none if a schema path is not found" This reverts commit `b1fcf6a4a5`.	2021-07-05 23:16:39 -04:00
Joshua M. Boniface	b1fcf6a4a5	Return none if a schema path is not found This can cause overwriting of unintended keys, so should not be happening. Will have to find the bugs this causes.	2021-07-05 17:15:55 -04:00
Joshua M. Boniface	a69105569f	Add node PVC version data to Node information Allows API client to see the currently-active version of the node daemon.	2021-07-05 09:57:38 -04:00
Joshua M. Boniface	e44f3d623e	Remove unnecessary try/except blocks from VM reads The zkhandler read() function takes care of ensuring there is a None value returned if these fail, so these aren't required. Makes the code a fair bit more readable here.	2021-07-02 12:01:58 -04:00
Joshua M. Boniface	43009486ae	Move Ceph pool/volume list assembly to thread pool Same reasons as the VM list, though less impactful.	2021-07-01 17:33:13 -04:00
Joshua M. Boniface	58789f1db4	Move VM list assembly to thread pool This helps parallelize the numerous Zookeeper calls a little bit, at least within the bounds of the GIL, to improve performance when getting a large list of VMs. The max_workers value is capped at 32 to avoid causing too many threads during concurrent executions, but still provides a noticeable speedup (on the order of 0.2-0.4 seconds with 75 VMs, scaling up further as counts grow).	2021-07-01 17:32:47 -04:00
Joshua M. Boniface	baf4c3fbc7	Add performance profiler function Usable anywhere that the global daemon "config" parameter can be passed in (e.g. pvcapid/helper.py, pvcnoded/Daemon.py, etc.). Stores results in a subdirectory of the PVC logdir called "profiler" if this directory can be created, or prints results. The debug config parameter ensures that the profiler can be added to functions and not run unless the server is explicitly in debug mode. Might not be useful as I don't initially plan to add this to every function (only when investigating performance problems), but this flexibility allows that to change later.	2021-07-01 14:01:33 -04:00
Joshua M. Boniface	e093efceb1	Add NoNodeError handlers in ZK locks Instead of looping 5+ times acquiring an impossible lock on a nonexistent key, just fail on a different error and return failure immediately. This is likely a major corner case that shouldn't happen, but better to be safe than 500.	2021-07-01 01:17:38 -04:00
Joshua M. Boniface	a080598781	Avoid superfluous ZK exists calls These cause a major (2x) slowdown in read calls since Zookeeper connections are expensive/slow. Instead, just try the thing and return None if there's no key there. Also wrap the children command in similar error handling since that did not exist and could likely cause some bugs at some point.	2021-07-01 01:15:51 -04:00
Joshua M. Boniface	6adaf1f669	Fix incorrect handling of deletions in init	2021-06-29 18:41:02 -04:00
Joshua M. Boniface	f91c07fdcf	Re-add UUID limit matching for full UUIDs This was valuable when passing a full UUID in, so go back to that. Verify first that the limit string is an actual UUID, and then compare against it if applicable.	2021-06-28 12:27:43 -04:00
Joshua M. Boniface	c54f66efa8	Limit match only on VM name I can see no possible reason to want to do limits against UUIDs, but supporting that means match is not what one would expect since a random UUID could match the limit. So only limit based on the name.	2021-06-23 19:17:35 -04:00
Joshua M. Boniface	cd860bae6b	Optimize VM list in API With many VMs this slows down linearly. Rework it a bit so there are fewer calls to getInformationFromXML and so the processing could happen in parallel at some point.	2021-06-23 19:14:26 -04:00
Joshua M. Boniface	07dbd55f03	Use list comprehension to compare against source	2021-06-22 02:31:14 -04:00
Joshua M. Boniface	6cd0ccf0ad	Fix network check on VM config modification	2021-06-22 02:21:55 -04:00
Joshua M. Boniface	e623909a43	Store PHY MAC for VFs and restore after free	2021-06-22 00:56:47 -04:00
Joshua M. Boniface	60e1da09dd	Don't try any shenanigans when updating NICs Trying to do this on the VMInstance side had problems because we can't differentiate the 3 types of migration there. So, just update this in the API side and hope everything goes well. This introduces an edge bug: if a VM is using a macvtap SR-IOV device, and then tries to migrate, and the migrate is aborted, the NIC lists will be inconsistent. When I revamp the VMInstance in the future, I should be able to correct this, but for now we'll have to live with that edge case.	2021-06-22 00:00:50 -04:00
Joshua M. Boniface	dc560c1dcb	Better handle retcodes in migrate update	2021-06-21 23:46:47 -04:00
Joshua M. Boniface	68c7481aa2	Ensure offline migrations update SR-IOV NIC states	2021-06-21 23:35:52 -04:00
Joshua M. Boniface	24ce361a04	Ensure SR-IOV NIC states are updated on migration	2021-06-21 23:18:34 -04:00
Joshua M. Boniface	eeb83da97d	Add support for SR-IOV NICs to VMs	2021-06-21 23:18:22 -04:00
Joshua M. Boniface	64d1a37b3c	Add PCIe device paths to SR-IOV VF information This will be used when adding VM network interfaces of type hostdev.	2021-06-21 21:08:46 -04:00
Joshua M. Boniface	13cc0f986f	Implement SR-IOV VF config set Also fixes some random bugs, adds proper interface sorting, and assorted tweaks.	2021-06-21 18:40:11 -04:00
Joshua M. Boniface	33195c3c29	Ensure VF list is sorted	2021-06-21 17:11:48 -04:00
Joshua M. Boniface	a697c2db2e	Add SRIOV PF and VF listing to API	2021-06-21 01:42:55 -04:00
Joshua M. Boniface	509afd4d05	Add hostdev net_type to handler as well	2021-06-17 01:52:58 -04:00
Joshua M. Boniface	e7b6a3eac1	Implement SR-IOV PF and VF instances Adds support for the node daemon managing SR-IOV PF and VF instances. PFs are added to Zookeeper automatically based on the config at startup during network configuration, and are otherwise completely static. PFs are automatically removed from Zookeeper, along with all corresponding VFs, should the PF phy device be removed from the configuration. VFs are configured based on the (autocreated) VFs of each PF device, added to Zookeeper, and then a new class instance, SRIOVVFInstance, is used to watch them for configuration changes. This will enable the runtime management of VF settings by the API. The set of keys ensures that both configuration and details of the NIC can be tracked. Most keys are self-explanatory, especially for PFs and the basic keys for VFs. The configuration tree is also self-explanatory, being based entirely on the options available in the `ip link set {dev} vf` command. Two additional keys are also present: `used` and `used_by`, which will be able to track the (boolean) state of usage, as well as the VM that uses a given VIF. Since the VM side implementation will support both macvtap and direct "hostdev" assignments, this will ensure that this state can be tracked on both the VF and the VM side.	2021-06-17 01:33:03 -04:00
Joshua M. Boniface	f540dd320b	Allow VNI for "direct" type vNICs	2021-06-15 00:27:01 -04:00
Joshua M. Boniface	23318524b9	Ensure validate writes a valid schema version	2021-06-14 21:27:37 -04:00
Joshua M. Boniface	5f11b3198b	Fix base schema None issue in handler too	2021-06-14 21:13:40 -04:00
Joshua M. Boniface	20c773413c	Fix bug in snapshot rename	2021-06-14 00:55:26 -04:00
Joshua M. Boniface	49f4feb482	Fix typo bug in key rename	2021-06-14 00:51:45 -04:00
Joshua M. Boniface	30a160d5ff	Fix invalid type_key	2021-06-13 21:20:10 -04:00
Joshua M. Boniface	1cbc66dccf	Fix bugs in lease listing	2021-06-13 21:10:42 -04:00
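
Commit 58789f1db4 above describes moving VM list assembly to a thread pool with max_workers capped at 32. The following is only a rough sketch of that general pattern using Python's standard `concurrent.futures`, not the project's actual code: `fetch_vm_info`, `assemble_vm_list`, and `MAX_WORKERS_CAP` are hypothetical names, and the per-VM Zookeeper reads are replaced by a stand-in that returns a fixed dictionary.

```python
from concurrent.futures import ThreadPoolExecutor

# Cap taken from the commit message; the constant name itself is an assumption.
MAX_WORKERS_CAP = 32


def fetch_vm_info(vm_name):
    """Hypothetical stand-in for the per-VM Zookeeper reads."""
    return {"name": vm_name, "state": "start"}


def assemble_vm_list(vm_names):
    """Build the VM list in parallel, preserving the input order."""
    if not vm_names:
        # ThreadPoolExecutor rejects max_workers=0, so handle the empty case first.
        return []
    # Never start more threads than there are VMs, and never exceed the cap.
    workers = min(len(vm_names), MAX_WORKERS_CAP)
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # executor.map yields results in the same order as vm_names.
        return list(executor.map(fetch_vm_info, vm_names))
```

Because the real work is I/O-bound (network round trips to Zookeeper), threads can overlap the waits even under the GIL, which matches the commit's remark about parallelizing "within the bounds of the GIL".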

469 Commits
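
Commits c54f66efa8 and f91c07fdcf in the listing describe limiting VM matches to the name, then re-adding UUID matching only when the limit string is an actual, full UUID. A minimal sketch of that check, assuming plain substring matching on the name (the log does not say how names are matched) and using hypothetical function names, might look like this:

```python
import uuid


def is_full_uuid(limit):
    """Return True only if limit is a complete, canonical hyphenated UUID string."""
    try:
        # uuid.UUID also accepts braces, a urn: prefix, and unhyphenated hex,
        # so compare the canonical form back against the input.
        return str(uuid.UUID(limit)) == limit.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def vm_matches_limit(vm_name, vm_uuid, limit):
    """Match on the UUID only for full UUIDs; otherwise match on the name."""
    if is_full_uuid(limit):
        return vm_uuid.lower() == limit.lower()
    # Assumption: substring match on the name; the real matching rule may differ.
    return limit in vm_name
```

With this split, a fragment such as `123e4567` is never compared against UUIDs, so a random UUID cannot accidentally satisfy a name limit, which is the concern raised in c54f66efa8.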