Commit Graph

357 Commits

Author SHA1 Message Date
Joshua Boniface 7373bfed3f Add Prometheus metric exporter
Adds a "fake" Prometheus metrics endpoint which returns cluster status
information in Prometheus format.
2023-12-09 12:22:36 -05:00
Joshua Boniface 0bda095571 Move libvirt_schema and fix other imports 2023-12-09 12:20:29 -05:00
Joshua Boniface 5a7ea25266 Fix incorrect database name entries 2023-12-09 12:12:00 -05:00
Joshua Boniface 20acf3295f Add mass ack/delete of faults 2023-12-06 13:59:39 -05:00
Joshua Boniface 672e58133f Implement interfaces to faults 2023-12-04 01:37:54 -05:00
Joshua Boniface 988de1218f Bump version to 0.9.83 2023-12-01 17:37:42 -05:00
Joshua Boniface 102c3c3106 Port all Celery worker functions to discrete pkg
Moves all tasks run by the Celery worker into a discrete package/module
for easier installation. Also adjusts several parameters throughout to
accomplish this.
2023-11-30 02:24:54 -05:00
Joshua Boniface 0c0fb65c62 Rework Flask API to route Celery tasks manually
Avoids needing to define any of these tasks here; they can all be
defined in the pvcworkerd code.
2023-11-30 00:40:09 -05:00
Joshua Boniface 03a738f878 Move config parser into daemon_lib
And reformat/add config values for API.
2023-11-30 00:05:37 -05:00
Joshua Boniface 647cba3cf5 Expand startup width for new daemon name 2023-11-29 21:21:51 -05:00
Joshua Boniface c8f4cbb39e Fix node entry keys 2023-11-27 13:24:01 -05:00
Joshua Boniface 786fae7769 Improve logo output 2023-11-27 13:01:43 -05:00
Joshua Boniface 17f81e8296 Refactor pvcapid to use new configuration 2023-11-27 12:49:26 -05:00
Joshua Boniface 460a2dd09f Bump version to 0.9.82 2023-11-25 15:38:50 -05:00
Joshua Boniface 24cabd3b99 Fix missing result_backend on Debian 10/11
For whatever reason, a Celery worker on <5.2.x was not picking these up.
Move them back to the root of the module so they are properly picked up
on these older versions but still prevents calling the routing functions
during an API doc generation.
2023-11-25 15:35:25 -05:00
Joshua Boniface 3e001b08b6 Bump version to 0.9.81 2023-11-17 01:29:41 -05:00
Joshua Boniface b66cfb07d8 Isolate cluster-dependent Celery startup
Avoids calling unworkable functions when generating API docs etc. by
isolating them into a Celery startup function called by Daemon.py.

Also update to Celery 4+ settings format.
2023-11-16 20:32:29 -05:00
Joshua Boniface 9885914abd Remove stray periods from messages 2023-11-16 19:56:24 -05:00
Joshua Boniface e8da3714c0 Convert benchmark to use new Celery step structure 2023-11-16 19:36:23 -05:00
Joshua Boniface 4d23d0419c Fix total stage count 2023-11-16 18:41:43 -05:00
Joshua Boniface c1c22c81e7 Ensure script cleanup is done in chroot 2023-11-16 18:27:23 -05:00
Joshua Boniface 712a50ca27 Avoid use of fail here
It causes a reraise with a bunch of extra entries that we don't need.
2023-11-16 18:22:59 -05:00
Joshua Boniface 823ce8cbf2 Remove duplicate cleanups 2023-11-16 18:19:05 -05:00
Joshua Boniface fca02238d7 Adjust starting text 2023-11-16 18:06:31 -05:00
Joshua Boniface 73a4795967 Avoid fail during yields
This just causes a double-exception, so don't do it.
2023-11-16 17:22:53 -05:00
Joshua Boniface 2a637c62e8 Port provisioner scripts to updated framework
Updates all the example provisioner scripts to use the new functions
exposed by the VMBuilder class as an illustration of how best to use
them.

Also adds a wrapper fail() handler to ensure the cleanup of the script,
as well as the global cleanup, are run on an exception.
2023-11-16 17:04:46 -05:00
Joshua Boniface 618a1c1c10 Add helper functions to VMBuilder instances 2023-11-16 16:17:17 -05:00
Joshua Boniface f50f170d4e Convert vmbuilder to use new Celery step structure 2023-11-16 16:08:49 -05:00
Joshua Boniface 9ab505ec98 Return and show task_name 2023-11-16 14:50:02 -05:00
Joshua Boniface 0cb81f96e6 Use custom task IDs for Celery tasks
Full UUIDs were obnoxiously long, so switch to using just the first
8-character section of a UUID instead. Keeps the list nice and short,
makes them easier to copy, and is just generally nicer.

Could this cause uniqueness problems? Perhaps, but I don't see that
happening nearly frequently enough to matter.
2023-11-16 13:22:14 -05:00
Joshua Boniface d226e9f4e5 Enable extended Celery results 2023-11-16 12:02:57 -05:00
Joshua Boniface fa361a55d9 Explicitly use kwargs in Celery task calls 2023-11-16 11:55:30 -05:00
Joshua Boniface 262babc63d Use kwargs for all task arguments
This will help ensure that the CLI frontend can properly parse the args
in a consistent way.
2023-11-16 10:10:48 -05:00
Joshua Boniface 289049d223 Properly handle a "primary" run_on value 2023-11-16 02:49:29 -05:00
Joshua Boniface 0bec6abe71 Return proper run_on for ported tasks 2023-11-16 02:28:57 -05:00
Joshua Boniface 484e6542c2 Port remaining tasks to new task handler
Move the create_vm and run_benchmark tasks to use the new Celery
subsystem, handlers, and wait command. Remove the obsolete, dedicated
API endpoints.

Standardize the CLI client and move the repeated handler code into a
separate common function.
2023-11-16 02:00:23 -05:00
Joshua Boniface ce17c60a20 Port OSD on-node tasks to Celery worker system
Adds Celery versions of the osd_add, osd_replace, osd_refresh,
osd_remove, and osd_db_vg_add functions.
2023-11-09 23:28:08 -05:00
Joshua Boniface 89681d54b9 Port VM on-node tasks to Celery worker system
Adds Celery versions of the flush_locks, device_attach, and
device_detach functions.
2023-11-06 20:40:46 -05:00
Joshua Boniface 3dc1f57de2 Revert "Switch to ZK+PG over Redis for Celery queue"
This reverts commit 54215bab6c.
2023-11-05 17:10:46 -05:00
Joshua Boniface af8a8d969e Ensure queues are set up for non-coordinator nodes
Allows a runner to operate on every possible node, not just
coordinators, as OSDs or other things could be on any node.

Also add more comments.
2023-11-04 15:05:07 -04:00
Joshua Boniface a6caac1b78 Add Celery queue routing for tasks
By default, tasks will continue to run as they did, on the primary
coordinator's task runner. However this opens the possibility for
defining more tasks that will run on other nodes or coordinators.
2023-11-04 14:29:59 -04:00
Joshua Boniface 54215bab6c Switch to ZK+PG over Redis for Celery queue
Redis did not provide a distributed solution for the worker, which
precluded several important planned functions. So instead, move to using
Zookeeper + PostgreSQL as the broker and result backend respectively.

Should be a seamless drop-in change but for future uses requires the
database host to be the primary coordinator IP rather than localhost, so
that writes can occur to the database from non-primary hosts.
2023-11-04 12:46:34 -04:00
Joshua Boniface 64e37ae963 Update OSD replacement functionality
1. Simplify this by leveraging the existing remove_osd/add_osd
functions, since its task was functionally identical to those two in
sequential order.
2. Add support for split OSDs within the command (replacing all OSDs on
the block device(s) as required).
3. Add additional configurability and flexibility around the old device,
weight, and external DB LVs.
2023-11-03 01:45:49 -04:00
Joshua Boniface 980ea6a9e9 Adjust handling of ext_db and _count options
Avoid the use of superfluous flag options, default them to none, and add
support for fixed-size DB LVs.
2023-11-02 13:29:47 -04:00
Joshua Boniface 526a5f4a74 Add support for split OSD adds
Allows creating multiple OSDs on a single (NVMe) block device,
leveraging the "ceph-volume lvm batch" command. Replaces the previous
method of creating OSDs.

Also adds a new ZK item for each OSD indicating if it is split or not.
2023-11-01 21:31:35 -04:00
Joshua Boniface 5b4dd61754 Bump version to 0.9.80 2023-10-27 09:56:31 -04:00
Joshua Boniface 221af3f241 Bump version to 0.9.79 2023-10-24 02:10:24 -04:00
Joshua Boniface c87736eb0a Use consistent path name and format 2023-10-24 01:20:44 -04:00
Joshua Boniface 63d0a85e29 Add backup deletion command 2023-10-24 01:18:27 -04:00
Joshua Boniface 55ca131c2c Handle snapshots on restore and provide options
Also rename the retain option to remove superfluous plural.
2023-10-24 00:25:06 -04:00