parallelvirtualcluster/pvc

Author	SHA1	Message	Date
Joshua M. Boniface	5d0e7931d1	Add support for rolling back snapshots We supported creating snapshots, but not doing anything with them. This removes the manual task of restoring a snapshot and replaces it with a PVC abstraction of rolling back to a snapshot. While Ceph recommends cloning a snapshot instead of rolling back, due to the time taken, in our use case I don't think that is an optimal strategy, as it will leave dangling clones that we'd then have to manage. Closes #183	2024-05-13 15:24:51 -04:00
Joshua M. Boniface	ab944f9b95	Add RBD snap purge during volume removal Fixes #180	2024-04-19 10:31:11 -04:00
Joshua M. Boniface	9714ac20b2	Update formatting for Black 24.4.0	2024-04-19 10:26:06 -04:00
Joshua M. Boniface	a461791ce8	Fix bug cleaning up successful benchmark results	2024-03-08 14:22:07 -05:00
Joshua M. Boniface	9fdb6d8708	Fix bug with network stats	2024-03-07 15:44:35 -05:00
Joshua M. Boniface	2fb7c40497	Work around bad plugin data	2024-03-07 14:37:05 -05:00
Joshua M. Boniface	67ec41aaf9	Fix invalid memory errors for stopped VMs	2024-02-06 13:30:48 -05:00
Joshua M. Boniface	a95e72008e	Add size validations for volume clones Adds the same validations as a volume add or resize to volume clones, to ensure there is enough free space for them.	2024-02-02 11:37:29 -05:00
Joshua M. Boniface	efc7434143	Add safety check for 80% full size Adds a check that a volume creation or resize won't violate the 80% full rule for the storage cluster. This ensures a cluster won't get too full if a storage volume fills up. Also adds a force flag throughout the pipeline to override this check, should an administrator really want to do so. Closes #177	2024-02-02 11:37:00 -05:00
Joshua M. Boniface	8419659e1b	Ensure zkhandler is always cleaned up Even if the subfunction of an API @ZKConnection call fails, the zkhandler needs to terminate and clean up, or it leaves stuck threads around.	2024-01-30 09:48:17 -05:00
Joshua M. Boniface	d28fb71f57	Fix incorrect variable set	2024-01-24 14:40:40 -05:00
Joshua M. Boniface	09269f182c	Add live migrate max downtime selector meta field Adds a new flag to VM metadata to allow setting the VM live migration max downtime. This will enable very busy VMs that hang live migration to have this value changed.	2024-01-11 00:05:50 -05:00
Joshua M. Boniface	362edeed8c	Add backup reporting and improve metrics Major improvements to autobackup and backups, including additional information/fields in the backup JSON itself, improved error handling, and the ability to email reports of autobackups using a local sendmail utility.	2024-01-10 14:18:44 -05:00
Joshua M. Boniface	39c8367723	Add additional metainfo to VM backups Adds additional information about failures, runtime, file sizes, etc. to the JSON output of a VM backup. This helps enable additional reporting and summary information for autobackup runs.	2024-01-10 10:37:29 -05:00
Joshua M. Boniface	aac306c55b	Fix missing stats on old Debians	2024-01-09 12:10:16 -05:00
Joshua M. Boniface	c1ae571213	Add additional VM details to Prometheus	2023-12-29 14:09:39 -05:00
Joshua M. Boniface	123c7ce857	Update copyright header on all files for 2024 Last release of 2023 is probably the best time to do this.	2023-12-29 11:16:59 -05:00
Joshua M. Boniface	4969e90f8a	Allow enable/disable of Prometheus endpoints Since these are unauthenticated, it might be the case that an administrator wishes to completely disable these metrics endpoints. Provide that option via pvc.conf through pvc-ansible's existing enable_prometheus_exporters option and the new enable_prometheus configuration flag. Defaults to "yes" to provide all functionality unless explicitly disabled, as the author assumes that the PVC API is secured in other ways as well and that metric information is not completely sensitive.	2023-12-29 09:25:10 -05:00
Joshua M. Boniface	3346ce9bb0	Add missing shutdown state from combinations	2023-12-27 13:40:30 -05:00
Joshua M. Boniface	e654fbba08	Move debug condition handling to Logger Avoids many dozens of conditionals sprinkled throughout the code by centralizing this check into the main Logger instance.	2023-12-27 13:01:45 -05:00
Joshua M. Boniface	4375f66793	Use proper get() for invalid values	2023-12-27 12:03:48 -05:00
Joshua M. Boniface	3df3ca5b44	Fix value for OSD utilization Ceph provides in KB; convert to bytes.	2023-12-27 11:56:50 -05:00
Joshua M. Boniface	431ee69620	Use proper percentage for pool util	2023-12-27 10:03:00 -05:00
Joshua M. Boniface	88f4d79d5a	Handle invalid values on older Libvirt versions	2023-12-27 09:51:24 -05:00
Joshua M. Boniface	84d22751d8	Fix bad JSON data handler	2023-12-27 09:43:37 -05:00
Joshua M. Boniface	40ff005a09	Fix handling of Ceph OSD bytes	2023-12-26 12:43:51 -05:00
Joshua M. Boniface	9604f655d0	Improve node utilization metrics and fix bugs	2023-12-25 02:47:41 -05:00
Joshua M. Boniface	3e4cc53fdd	Add node network statistics and utilization values Adds a new physical network interface stats parser to the node keepalives, and leverages this information to provide a network utilization overview in the Prometheus metrics.	2023-12-21 15:45:01 -05:00
Joshua M. Boniface	d2d2a9c617	Include our newline atomically Sometimes clashing log entries would print on the same line, likely due to some sort of race condition in Python's print() built-in. Instead, add a newline to our actual message and print without an end character. This ensures atomic printing of our log messages.	2023-12-21 13:12:43 -05:00
Joshua M. Boniface	6ed4efad33	Add new network.stats key to nodes	2023-12-21 12:48:48 -05:00
Joshua M. Boniface	39f9f3640c	Rename health metrics and add resource metrics	2023-12-21 09:40:49 -05:00
Joshua M. Boniface	c64e888d30	Fix incorrect cast of None	2023-12-14 16:00:53 -05:00
Joshua M. Boniface	f1249452e5	Fix bug if no nodes are present	2023-12-14 15:32:18 -05:00
Joshua M. Boniface	f41c5176be	Ensure health value is an int properly	2023-12-13 14:34:02 -05:00
Joshua M. Boniface	ed9c37982a	Move metric collection into daemon library	2023-12-11 19:20:30 -05:00
Joshua M. Boniface	57c28376a6	Port one final Ceph function to read_many	2023-12-11 10:25:36 -05:00
Joshua M. Boniface	e781d742e6	Fix bug with volume and snapshot listing	2023-12-11 10:21:46 -05:00
Joshua M. Boniface	741dafb26b	Port VM functions to read_many	2023-12-11 03:34:36 -05:00
Joshua M. Boniface	5d9e83e8ed	Fix output bugs in VM information	2023-12-11 03:04:46 -05:00
Joshua M. Boniface	7c116b2fbc	Ensure node health value is an int	2023-12-10 23:56:50 -05:00
Joshua M. Boniface	1023c55087	Fix bug in VM state list	2023-12-10 23:44:01 -05:00
Joshua M. Boniface	9235187c6f	Port Ceph functions to read_many Only ports getOSDInformation, as all the others feature 3 or fewer reads, which is acceptable sequentially.	2023-12-10 22:24:38 -05:00
Joshua M. Boniface	0c94f1b4f8	Port Network functions to read_many	2023-12-10 22:19:21 -05:00
Joshua M. Boniface	44a4f0e1f7	Use new info detail output instead of new lists Avoids multiple additional ZK calls by using data that is now in the status detail output.	2023-12-10 22:19:09 -05:00
Joshua M. Boniface	5d53a3e529	Add state and faults detail to cluster information We already parse this information out anyways, so might as well add it to the API output JSON. This can be leveraged by the Prometheus endpoint as well to avoid duplicate listings.	2023-12-10 17:29:32 -05:00
Joshua M. Boniface	35e22cb50f	Simplify cluster status handling This significantly simplifies cluster state handling by removing most of the superfluous get_list() calls, replacing them with basic child reads since most of them are just for a count anyways. The ones that require states simplify this down to a child read plus direct reads for the exact items required while leveraging the new read_many() function.	2023-12-10 17:05:46 -05:00
Joshua M. Boniface	a3171b666b	Split node health into separate function	2023-12-10 16:52:10 -05:00
Joshua M. Boniface	48e41d7b05	Port Faults getFault and getAllFaults to read_many	2023-12-10 16:05:16 -05:00
Joshua M. Boniface	d6aecf195e	Port Node getNodeInformation to read_many	2023-12-10 15:53:28 -05:00
Joshua M. Boniface	9329784010	Implement async ZK read function Adds a function, "read_many", which can take in multiple ZK keys and return the values from all of them, using asyncio to avoid reading sequentially. Initial tests show a marked improvement in read performance of multiple read()-heavy functions (e.g. "get_list()" functions) with this method.	2023-12-10 15:35:40 -05:00
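
The 9329784010 message describes "read_many" as taking multiple ZK keys and returning all of their values, using asyncio to avoid sequential reads. The following is a minimal sketch of that idea only, not the PVC implementation: `FakeStore` is a hypothetical in-memory stand-in for a ZooKeeper client, and the use of `asyncio.to_thread` (Python 3.9+) to run the blocking reads concurrently is an assumption about one way to do it.

```python
import asyncio
import time


class FakeStore:
    """Hypothetical stand-in for a ZooKeeper client with a blocking read()."""

    def __init__(self, data):
        self._data = dict(data)

    def read(self, key):
        # Simulate the latency of a network round trip.
        time.sleep(0.01)
        return self._data.get(key)


async def _read_many(store, keys):
    # Run each blocking read in a worker thread so the reads overlap
    # instead of happening one after another.
    tasks = [asyncio.to_thread(store.read, key) for key in keys]
    # gather() returns results in the same order as the tasks were given.
    return await asyncio.gather(*tasks)


def read_many(store, keys):
    """Return the values for all keys, in key order (None if a key is missing)."""
    return asyncio.run(_read_many(store, keys))
```

With this shape, a caller that previously issued one `read()` per key in a loop would pass the whole key list at once, for example `read_many(store, ["/a", "/b"])`.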


400 Commits
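
The 8419659e1b message notes that the zkhandler must terminate and clean up even when the function wrapped by an API @ZKConnection call fails. A common way to guarantee this is a `try`/`finally` inside the decorator. The sketch below illustrates that pattern under stated assumptions: `FakeZKHandler`, `zk_connection`, and the `handler_factory` parameter are hypothetical names for illustration, not the PVC code.

```python
import functools


class FakeZKHandler:
    """Hypothetical stand-in for a ZooKeeper handler; it only tracks state."""

    def __init__(self):
        self.connected = False

    def connect(self):
        self.connected = True

    def disconnect(self):
        self.connected = False


def zk_connection(handler_factory):
    """Decorator factory: pass a connected handler to the wrapped function."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            zkhandler = handler_factory()
            zkhandler.connect()
            try:
                return func(zkhandler, *args, **kwargs)
            finally:
                # Runs on both success and exception, so the handler
                # is never left connected after a failure.
                zkhandler.disconnect()

        return wrapper

    return decorator
```

The exception still propagates to the caller; the `finally` block only ensures the disconnect happens first.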