parallelvirtualcluster/pvc

Author	SHA1	Message	Date
Joshua M. Boniface	79f7e8f82e	Skip "run_on" argument in output. This isn't required to know, it's internal.	2023-11-16 11:46:15 -05:00
Joshua M. Boniface	0d818017e8	Name the celery workers pvcworkerd@<hostname>	2023-11-16 11:43:17 -05:00
Joshua M. Boniface	eb1d61a8b9	Generalize task status output	2023-11-16 11:39:08 -05:00
Joshua M. Boniface	262babc63d	Use kwargs for all task arguments. This will help ensure that the CLI frontend can properly parse the args in a consistent way.	2023-11-16 10:10:48 -05:00
Joshua M. Boniface	63773a3061	Allow watching existing task via cluster task	2023-11-16 03:06:13 -05:00
Joshua M. Boniface	289049d223	Properly handle a "primary" run_on value	2023-11-16 02:49:29 -05:00
Joshua M. Boniface	e818df5dae	Use enable/disable --now instead of two commands. Avoids needing two calls here especially for the stop.	2023-11-16 02:40:35 -05:00
Joshua M. Boniface	c76a5afd04	Avoid waits during node secondary. Waiting for the daemons to stop took too much time on some nodes and could throw off the lockstep. Instead, leverage background=True to run the systemctl os_commands in the background (when they complete is irrelevant), stop the Metadata API first, and don't delay during its stop at all.	2023-11-16 02:34:12 -05:00
Joshua M. Boniface	0bec6abe71	Return proper run_on for ported tasks	2023-11-16 02:28:57 -05:00
Joshua M. Boniface	18e43a9377	Adjust name in worker log output	2023-11-16 02:25:14 -05:00
Joshua M. Boniface	4555f5a20a	Remove warnings when switching coordinator state. Tasks are no longer bound to the primary coordinator for state updates due to using KeyDB and a proper shared queue and result backend, so this warning is now obsolete and no longer required. This would interrupt "--wait" commands on provisioner tasks, but we no longer believe that this warrants a warning, as the affected user could simply run "pvc cluster task" to validate or resume the watcher.	2023-11-16 02:15:01 -05:00
Joshua M. Boniface	d727764ebc	Remove obsolete status and add cluster task. Removes the obsoleted "pvc provisioner status" command and replaces it with a generalized "pvc cluster task" command to show all currently-active or pending tasks on the cluster workers.	2023-11-16 02:13:26 -05:00
Joshua M. Boniface	484e6542c2	Port remaining tasks to new task handler. Move the create_vm and run_benchmark tasks to use the new Celery subsystem, handlers, and wait command. Remove the obsolete, dedicated API endpoints. Standardize the CLI client and move the repeated handler code into a separate common function.	2023-11-16 02:00:23 -05:00
Joshua M. Boniface	aef38639cf	Rename pvcapid-worker to pvcworkerd	2023-11-15 20:31:39 -05:00
Joshua M. Boniface	5f1432ccdd	Fix memory allocation updates and add more debug. Previously, we were assigning memalloc/memprov/vcpualloc during an earlier phase using the main d_domain list. I'm not sure exactly why, but this was throwing off stats after a fence. Instead, set these values later on while parsing the actually-active VMs.	2023-11-10 10:29:32 -05:00
Joshua M. Boniface	d6b8808448	Clean up fencing handler: (1) remove all format strings in favour of f-strings; (2) ensure all logger messages have a prefix; (3) add a few more logger messages for clarity.	2023-11-10 10:09:54 -05:00
Joshua M. Boniface	83c4c6633d	Readd RBD lock detection and clearing on startup. This is still needed due to the nature of the locks and freeing them on startup, and to preserve lock=fail behaviour on VM startup. Also fixes the fencing lock flush to directly use the client library outside of Celery. I don't like this hack but it seems prudent until we move fencing to the workers as well.	2023-11-10 01:33:48 -05:00
Joshua M. Boniface	2a9bc632fa	Add node monitoring plugin for KeyDB/Redis	2023-11-10 00:56:46 -05:00
Joshua M. Boniface	b5e4c52387	Increase worker concurrency to 3	2023-11-10 00:39:42 -05:00
Joshua M. Boniface	b522306f87	Increase Celery wait times. It's a bit inefficient, but provides nicer output and a bit of settling time between each stage.	2023-11-09 23:54:05 -05:00
Joshua M. Boniface	07026efb63	Ensure OSD checks in before completing. Avoids issues where the new OSD doesn't check in; at least the administrator will know. Also fixes some issues with osd_db in removal.	2023-11-09 23:51:05 -05:00
Joshua M. Boniface	d7ea705e31	Improve waiter output. Add an extra newline, show the name of the task (from start()), and show the first step as a "Gathering information" message on the progressbar.	2023-11-09 23:28:18 -05:00
Joshua M. Boniface	08411708f6	Clean up dangling references to cmd pipes. Also removes the schema references for these CMD pipes as they are no longer required.	2023-11-09 23:28:14 -05:00
Joshua M. Boniface	ce17c60a20	Port OSD on-node tasks to Celery worker system. Adds Celery versions of the osd_add, osd_replace, osd_refresh, osd_remove, and osd_db_vg_add functions.	2023-11-09 23:28:08 -05:00
Joshua M. Boniface	89681d54b9	Port VM on-node tasks to Celery worker system. Adds Celery versions of the flush_locks, device_attach, and device_detach functions.	2023-11-06 20:40:46 -05:00
Joshua M. Boniface	f0c2e9d295	Don't start pvcapid-worker on primary. It will be running anyways.	2023-11-05 19:44:00 -05:00
Joshua M. Boniface	2c15036f86	Add KeyDB to node startup services. Also ensure API worker starts on all nodes, not just coordinators.	2023-11-05 19:26:38 -05:00
Joshua M. Boniface	42ed6f6420	Remove redis as a dependency	2023-11-05 18:23:34 -05:00
Joshua M. Boniface	3dc1f57de2	Revert "Switch to ZK+PG over Redis for Celery queue". This reverts commit `54215bab6c`.	2023-11-05 17:10:46 -05:00
Joshua M. Boniface	b99b4e64b2	Ensure store path is passed properly	2023-11-05 16:48:47 -05:00
Joshua M. Boniface	91af1175ef	Fix missing CLI_CONFIG in echo()	2023-11-04 15:17:50 -04:00
Joshua M. Boniface	af8a8d969e	Ensure queues are set up for non-coordinator nodes. Allows a runner to operate on every possible node, not just coordinators, as OSDs or other things could be on any node. Also add more comments.	2023-11-04 15:05:07 -04:00
Joshua M. Boniface	a6caac1b78	Add Celery queue routing for tasks. By default, tasks will continue to run as they did, on the primary coordinator's task runner. However, this opens the possibility for defining more tasks that will run on other nodes or coordinators.	2023-11-04 14:29:59 -04:00
Joshua M. Boniface	30d7e49401	Start API worker with node daemon on coordinators	2023-11-04 13:08:16 -04:00
Joshua M. Boniface	ab629f6b51	Use per-host hostname and queues in worker. Opens up the ability to direct tasks to specific workers.	2023-11-04 13:02:30 -04:00
Joshua M. Boniface	54215bab6c	Switch to ZK+PG over Redis for Celery queue. Redis did not provide a distributed solution for the worker, which precluded several important planned functions. So instead, move to using Zookeeper + PostgreSQL as the broker and result backend respectively. Should be a seamless drop-in change but for future uses requires the database host to be the primary coordinator IP rather than localhost, so that writes can occur to the database from non-primary hosts.	2023-11-04 12:46:34 -04:00
Joshua M. Boniface	7490f13b7c	Check for partition tables on new devices	2023-11-04 03:13:58 -04:00
Joshua M. Boniface	d1602f35de	Adjust split indicator	2023-11-04 02:56:21 -04:00
Joshua M. Boniface	7cdedde2fb	Adjust wording about extdb	2023-11-04 02:54:25 -04:00
Joshua M. Boniface	ab156b14b7	Update help messages for OSD refresh	2023-11-04 02:47:04 -04:00
Joshua M. Boniface	a016337f57	Remove block verify in API. This doesn't work right and is handled by the node anyways.	2023-11-04 02:45:10 -04:00
Joshua M. Boniface	e32054be81	Refactor refresh as well	2023-11-04 02:44:52 -04:00
Joshua M. Boniface	18d32fede3	Fix wording of detect strings	2023-11-04 01:37:07 -04:00
Joshua M. Boniface	b3d13fe9be	Add log message for zap	2023-11-04 01:02:51 -04:00
Joshua M. Boniface	48b2ccbd95	Add timeout for safe-to-destroy. Continuously take the OSD down and out while doing so.	2023-11-04 00:55:05 -04:00
Joshua M. Boniface	1535078842	Fix lvremove, lvcreate, and update ZK details	2023-11-04 00:30:14 -04:00
Joshua M. Boniface	0e45613634	Use right key with correct data	2023-11-04 00:02:00 -04:00
Joshua M. Boniface	75135f6d5f	Avoid broken output format for new OSDs	2023-11-03 23:54:10 -04:00
Joshua M. Boniface	7f5dd385b5	Use right key for FSID elsewhere	2023-11-03 23:51:01 -04:00
Joshua M. Boniface	befce62925	Add OSD destroy before purge	2023-11-03 23:44:27 -04:00

3246 Commits
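
Several commits above (per-host queues, the "primary" run_on value, kwargs-only task arguments, and the pvcworkerd@<hostname> worker name) describe how tasks are routed to a specific node's worker. The following is a minimal sketch of that routing idea only, not the project's actual code: the function names, the use of the bare hostname as the queue name, and the shape of the returned dictionary are all assumptions made for illustration. The dictionary mirrors the `kwargs` and `queue` options that Celery's `apply_async()` accepts, but Celery itself is not imported here so the sketch stays self-contained.

```python
from typing import Any, Dict, Optional


def resolve_queue(run_on: str, primary_node: Optional[str]) -> str:
    """Map a run_on value to a queue name.

    Assumption: each node's worker listens on a queue named after its
    hostname, and "primary" means the current primary coordinator.
    """
    if run_on == "primary":
        if not primary_node:
            raise ValueError("no primary coordinator is known")
        return primary_node
    return run_on


def worker_name(hostname: str) -> str:
    """Build a worker name in the pvcworkerd@<hostname> form."""
    return f"pvcworkerd@{hostname}"


def build_dispatch(
    task_name: str,
    run_on: str,
    primary_node: Optional[str],
    **task_kwargs: Any,
) -> Dict[str, Any]:
    """Describe a task dispatch using keyword arguments only.

    The returned "kwargs" and "queue" entries correspond to the options
    one would pass to Celery's apply_async(); the task name is hypothetical.
    """
    return {
        "task": task_name,
        "queue": resolve_queue(run_on, primary_node),
        "kwargs": dict(task_kwargs),
    }
```

With a primary coordinator named hv1 (a made-up hostname), `build_dispatch("osd.add", "primary", "hv1", device="/dev/sdb")` targets the queue `hv1`, while passing `run_on="hv2"` would target `hv2` regardless of which node is primary.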