parallelvirtualcluster/pvc - pvc

Commit Graph

Author	SHA1	Message	Date
Joshua Boniface	44232fe3c6	Fix export swagger definition	2024-08-20 11:07:56 -04:00
Joshua Boniface	faf920ac1d	Fix bug where force_flag is a string	2024-08-20 10:10:33 -04:00
Joshua Boniface	d060787503	Add initial implementation of snapshot export	2024-08-19 18:46:07 -04:00
Joshua Boniface	0cf229273a	Add API endpoint for current primary node This was never exposed before, so expose it for use in other functions being built.	2024-08-19 17:15:52 -04:00
Joshua Boniface	212ecaab68	Fix Swagger doc issues	2024-08-19 16:56:18 -04:00
Joshua Boniface	33f905459a	Implement VM rollback Closes #184	2024-08-16 10:47:18 -04:00
Joshua Boniface	fbd5b3cca3	Remove is_backup flag for snapshots This won't be needed for anything.	2024-08-16 10:46:25 -04:00
Joshua Boniface	6fc7f45027	Add snapshot lists and timestamp Adds snapshots to the list of data in VM objects	2024-08-16 10:46:25 -04:00
Joshua Boniface	0c240a5129	Add VM snapshot removal	2024-08-16 10:46:25 -04:00
Joshua Boniface	553c1e670e	Add VM snapshots functionality Adds the ability to create snapshots of an entire VM, including all its RBD disks and the VM XML config, though not any PVC metadata.	2024-08-16 10:46:25 -04:00
Joshua Boniface	5d0e7931d1	Add support for rolling back snapshots We supported creating snapshots, but not doing anything with them. This removes the manual task of restoring a snapshot and replaces it with a PVC abstraction of rolling back to a snapshot. While Ceph recommends cloning a snapshot instead of rolling back, due to the time taken, in our use case I don't think that is an optimal strategy, as it will leave dangling clones that we'd then have to manage. Closes #183	2024-05-13 15:24:51 -04:00
Joshua Boniface	3bc500bc55	Permit duplicate VNIs in templates with flag Supports niche use cases whereby a network template should contain the same VNI(s) more than once.	2024-02-09 12:12:04 -05:00
Joshua Boniface	a95e72008e	Add size validations for volume clones Adds the same validations as a volume add or resize to volume clones, to ensure there is enough free space for them.	2024-02-02 11:37:29 -05:00
Joshua Boniface	efc7434143	Add safety check for 80% full size Adds a check that a volume creation or resize won't violate the 80% full rule for the storage cluster. This ensures a cluster won't get too full if a storage volume fills up. Also adds a force flag throughout the pipeline to override this check, should an administrator really want to do so. Closes #177	2024-02-02 11:37:00 -05:00
Joshua Boniface	09269f182c	Add live migrate max downtime selector meta field Adds a new flag to VM metadata to allow setting the VM live migration max downtime. This will enable very busy VMs that hang live migration to have this value changed.	2024-01-11 00:05:50 -05:00
Joshua Boniface	3c458ca9a6	Fix broken result backend on old Celery	2024-01-09 12:06:23 -05:00
Joshua Boniface	123c7ce857	Update copyright header on all files for 2024 Last release of 2023 is probably the best time to do this.	2023-12-29 11:16:59 -05:00
Joshua Boniface	4969e90f8a	Allow enable/disable of Prometheus endpoints Since these are unauthenticated, it might be the case that an administrator wishes to completely disable these metrics endpoints. Provide that option via pvc.conf through pvc-ansible's existing enable_prometheus_exporters option and the new enable_prometheus configuration flag. Defaults to "yes" to provide all functionality unless explicitly disabled, as the author assumes that the PVC API is secured in other ways as well and that metric information is not completely sensitive.	2023-12-29 09:25:10 -05:00
Joshua Boniface	0bcf8cfe19	Add Zookeeper metrics proxy	2023-12-28 13:53:15 -05:00
Joshua Boniface	3e4cc53fdd	Add node network statistics and utilization values Adds a new physical network interface stats parser to the node keepalives, and leverages this information to provide a network utilization overview in the Prometheus metrics.	2023-12-21 15:45:01 -05:00
Joshua Boniface	39f9f3640c	Rename health metrics and add resource metrics	2023-12-21 09:40:49 -05:00
Joshua Boniface	4ca2381077	Rework metrics output and add combined endpoint	2023-12-09 15:47:40 -05:00
Joshua Boniface	9b3c9f1be5	Add Ceph metrics proxy and health fault counts	2023-12-09 12:22:36 -05:00
Joshua Boniface	7373bfed3f	Add Prometheus metric exporter Adds a "fake" Prometheus metrics endpoint which returns cluster status information in Prometheus format.	2023-12-09 12:22:36 -05:00
Joshua Boniface	20acf3295f	Add mass ack/delete of faults	2023-12-06 13:59:39 -05:00
Joshua Boniface	672e58133f	Implement interfaces to faults	2023-12-04 01:37:54 -05:00
Joshua Boniface	102c3c3106	Port all Celery worker functions to discrete pkg Moves all tasks run by the Celery worker into a discrete package/module for easier installation. Also adjusts several parameters throughout to accomplish this.	2023-11-30 02:24:54 -05:00
Joshua Boniface	0c0fb65c62	Rework Flask API to route Celery tasks manually Avoids needing to define any of these tasks here; they can all be defined in the pvcworkerd code.	2023-11-30 00:40:09 -05:00
Joshua Boniface	03a738f878	Move config parser into daemon_lib And reformat/add config values for API.	2023-11-30 00:05:37 -05:00
Joshua Boniface	24cabd3b99	Fix missing result_backend on Debian 10/11 For whatever reason, a Celery worker on <5.2.x was not picking these up. Move them back to the root of the module so they are properly picked up on these older versions but still prevents calling the routing functions during an API doc generation.	2023-11-25 15:35:25 -05:00
Joshua Boniface	b66cfb07d8	Isolate cluster-dependent Celery startup Avoids calling unworkable functions when generating API docs etc. by isolating them into a Celery startup function called by Daemon.py. Also update to Celery 4+ settings format.	2023-11-16 20:32:29 -05:00
Joshua Boniface	9ab505ec98	Return and show task_name	2023-11-16 14:50:02 -05:00
Joshua Boniface	0cb81f96e6	Use custom task IDs for Celery tasks Full UUIDs were obnoxiously long, so switch to using just the first 8-character section of a UUID instead. Keeps the list nice and short, makes them easier to copy, and is just generally nicer. Could this cause uniqueness problems? Perhaps, but I don't see that happening nearly frequently enough to matter.	2023-11-16 13:22:14 -05:00
Joshua Boniface	d226e9f4e5	Enable extended Celery results	2023-11-16 12:02:57 -05:00
Joshua Boniface	fa361a55d9	Explicitly use kwargs in Celery task calls	2023-11-16 11:55:30 -05:00
Joshua Boniface	262babc63d	Use kwargs for all task arguments This will help ensure that the CLI frontend can properly parse the args in a consistent way.	2023-11-16 10:10:48 -05:00
Joshua Boniface	289049d223	Properly handle a "primary" run_on value	2023-11-16 02:49:29 -05:00
Joshua Boniface	0bec6abe71	Return proper run_on for ported tasks	2023-11-16 02:28:57 -05:00
Joshua Boniface	484e6542c2	Port remaining tasks to new task handler Move the create_vm and run_benchmark tasks to use the new Celery subsystem, handlers, and wait command. Remove the obsolete, dedicated API endpoints. Standardize the CLI client and move the repeated handler code into a separate common function.	2023-11-16 02:00:23 -05:00
Joshua Boniface	ce17c60a20	Port OSD on-node tasks to Celery worker system Adds Celery versions of the osd_add, osd_replace, osd_refresh, osd_remove, and osd_db_vg_add functions.	2023-11-09 23:28:08 -05:00
Joshua Boniface	89681d54b9	Port VM on-node tasks to Celery worker system Adds Celery versions of the flush_locks, device_attach, and device_detach functions.	2023-11-06 20:40:46 -05:00
Joshua Boniface	3dc1f57de2	Revert "Switch to ZK+PG over Redis for Celery queue" This reverts commit `54215bab6c`.	2023-11-05 17:10:46 -05:00
Joshua Boniface	af8a8d969e	Ensure queues are set up for non-coordinator nodes Allows a runner to operate on every possible node, not just coordinators, as OSDs or other things could be on any node. Also add more comments.	2023-11-04 15:05:07 -04:00
Joshua Boniface	a6caac1b78	Add Celery queue routing for tasks By default, tasks will continue to run as they did, on the primary coordinator's task runner. However this opens the possibility for defining more tasks that will run on other nodes or coordinators.	2023-11-04 14:29:59 -04:00
Joshua Boniface	54215bab6c	Switch to ZK+PG over Redis for Celery queue Redis did not provide a distributed solution for the worker, which precluded several important planned functions. So instead, move to using Zookeeper + PostgreSQL as the broker and result backend respectively. Should be a seamless drop-in change but for future uses requires the database host to be the primary coordinator IP rather than localhost, so that writes can occur to the database from non-primary hosts.	2023-11-04 12:46:34 -04:00
Joshua Boniface	64e37ae963	Update OSD replacement functionality 1. Simplify this by leveraging the existing remove_osd/add_osd functions, since its task was functionally identical to those two in sequential order. 2. Add support for split OSDs within the command (replacing all OSDs on the block device(s) as required). 3. Add additional configurability and flexibility around the old device, weight, and external DB LVs.	2023-11-03 01:45:49 -04:00
Joshua Boniface	980ea6a9e9	Adjust handling of ext_db and _count options Avoid the use of superfluous flag options, default them to none, and add support for fixed-size DB LVs.	2023-11-02 13:29:47 -04:00
Joshua Boniface	526a5f4a74	Add support for split OSD adds Allows creating multiple OSDs on a single (NVMe) block device, leveraging the "ceph-volume lvm batch" command. Replaces the previous method of creating OSDs. Also adds a new ZK item for each OSD indicating if it is split or not.	2023-11-01 21:31:35 -04:00
Joshua Boniface	c87736eb0a	Use consistent path name and format	2023-10-24 01:20:44 -04:00
Joshua Boniface	63d0a85e29	Add backup deletion command	2023-10-24 01:18:27 -04:00

168 Commits
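
Commit 0cb81f96e6 above describes replacing full UUIDs with just the first 8-character section of a UUID as the Celery task ID. A minimal sketch of that idea follows, assuming Python's standard `uuid` module; the function name `short_task_id` is hypothetical and is not taken from the PVC source.

```python
import uuid


def short_task_id() -> str:
    """Return the first dash-separated section (8 hex characters) of a random UUID.

    Hypothetical helper for illustration only; not the PVC implementation.
    """
    # str(uuid.uuid4()) has the form xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx;
    # splitting on "-" and taking the first element keeps the leading 8 characters.
    return str(uuid.uuid4()).split("-")[0]


if __name__ == "__main__":
    print(short_task_id())
```

On the uniqueness question the commit message raises: eight hexadecimal characters allow 2^32 distinct values, so collisions are possible but unlikely among a small number of tasks.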