365 Commits

Author SHA1 Message Date
9aee2a9075 Bump version to 0.9.84 2023-12-09 23:05:40 -05:00
4ca2381077 Rework metrics output and add combined endpoint 2023-12-09 15:47:40 -05:00
a70c1d63b0 Separate state totals from states, separate states 2023-12-09 13:59:17 -05:00
fd717b702d Use external list of fault states 2023-12-09 12:51:41 -05:00
132cde5591 Add totals and nice-format states
Avoids tons of annoying rewriting in the UI later.
2023-12-09 12:50:19 -05:00
ba565ead4c Report all state combinations in Prom metrics
Ensures that every state combination is always shown to metrics, even if
it contains 0 entries.
2023-12-09 12:40:37 -05:00
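A minimal sketch of the idea in ba565ead4c: count every state combination, including those with zero entries. It assumes hypothetical state lists, a hypothetical metric name (`example_nodes`), and a hypothetical node dictionary layout; none of these are taken from the project's code. The output lines follow the Prometheus text exposition format.

```python
from collections import Counter
from itertools import product

# Hypothetical state lists, for illustration only.
DAEMON_STATES = ["run", "stop", "dead"]
DOMAIN_STATES = ["ready", "flushed"]


def render_state_metrics(nodes):
    """Render one gauge sample per state combination, including zero counts."""
    counts = Counter((n["daemon_state"], n["domain_state"]) for n in nodes)
    lines = [
        "# HELP example_nodes Number of nodes per state combination",
        "# TYPE example_nodes gauge",
    ]
    # Iterate over the full cross product rather than only the observed
    # combinations, so that empty combinations are still reported.
    for daemon_state, domain_state in product(DAEMON_STATES, DOMAIN_STATES):
        value = counts[(daemon_state, domain_state)]  # Counter yields 0 for missing keys
        lines.append(
            f'example_nodes{{daemon_state="{daemon_state}",'
            f'domain_state="{domain_state}"}} {value}'
        )
    return "\n".join(lines) + "\n"
```

Emitting the zero-valued series keeps the set of time series stable, so a combination does not vanish from dashboards when its count drops to zero.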
2b8abea8df Remove debug printing 2023-12-09 12:22:36 -05:00
9b3c9f1be5 Add Ceph metrics proxy and health fault counts 2023-12-09 12:22:36 -05:00
7373bfed3f Add Prometheus metric exporter
Adds a "fake" Prometheus metrics endpoint which returns cluster status
information in Prometheus format.
2023-12-09 12:22:36 -05:00
0bda095571 Move libvirt_schema and fix other imports 2023-12-09 12:20:29 -05:00
5a7ea25266 Fix incorrect database name entries 2023-12-09 12:12:00 -05:00
20acf3295f Add mass ack/delete of faults 2023-12-06 13:59:39 -05:00
672e58133f Implement interfaces to faults 2023-12-04 01:37:54 -05:00
988de1218f Bump version to 0.9.83 2023-12-01 17:37:42 -05:00
102c3c3106 Port all Celery worker functions to discrete pkg
Moves all tasks run by the Celery worker into a discrete package/module
for easier installation. Also adjusts several parameters throughout to
accomplish this.
2023-11-30 02:24:54 -05:00
0c0fb65c62 Rework Flask API to route Celery tasks manually
Avoids needing to define any of these tasks here; they can all be
defined in the pvcworkerd code.
2023-11-30 00:40:09 -05:00
03a738f878 Move config parser into daemon_lib
And reformat/add config values for API.
2023-11-30 00:05:37 -05:00
647cba3cf5 Expand startup width for new daemon name 2023-11-29 21:21:51 -05:00
c8f4cbb39e Fix node entry keys 2023-11-27 13:24:01 -05:00
786fae7769 Improve logo output 2023-11-27 13:01:43 -05:00
17f81e8296 Refactor pvcapid to use new configuration 2023-11-27 12:49:26 -05:00
460a2dd09f Bump version to 0.9.82 2023-11-25 15:38:50 -05:00
24cabd3b99 Fix missing result_backend on Debian 10/11
For whatever reason, a Celery worker on <5.2.x was not picking these up.
Move them back to the root of the module so they are properly picked up
on these older versions while still preventing calls to the routing
functions during API doc generation.
2023-11-25 15:35:25 -05:00
3e001b08b6 Bump version to 0.9.81 2023-11-17 01:29:41 -05:00
b66cfb07d8 Isolate cluster-dependent Celery startup
Avoids calling unworkable functions when generating API docs etc. by
isolating them into a Celery startup function called by Daemon.py.

Also update to Celery 4+ settings format.
2023-11-16 20:32:29 -05:00
9885914abd Remove stray periods from messages 2023-11-16 19:56:24 -05:00
e8da3714c0 Convert benchmark to use new Celery step structure 2023-11-16 19:36:23 -05:00
4d23d0419c Fix total stage count 2023-11-16 18:41:43 -05:00
c1c22c81e7 Ensure script cleanup is done in chroot 2023-11-16 18:27:23 -05:00
712a50ca27 Avoid use of fail here
It causes a reraise with a bunch of extra entries that we don't need.
2023-11-16 18:22:59 -05:00
823ce8cbf2 Remove duplicate cleanups 2023-11-16 18:19:05 -05:00
fca02238d7 Adjust starting text 2023-11-16 18:06:31 -05:00
73a4795967 Avoid fail during yields
This just causes a double-exception, so don't do it.
2023-11-16 17:22:53 -05:00
2a637c62e8 Port provisioner scripts to updated framework
Updates all the example provisioner scripts to use the new functions
exposed by the VMBuilder class as an illustration of how best to use
them.

Also adds a wrapper fail() handler to ensure the cleanup of the script,
as well as the global cleanup, are run on an exception.
2023-11-16 17:04:46 -05:00
618a1c1c10 Add helper functions to VMBuilder instances 2023-11-16 16:17:17 -05:00
f50f170d4e Convert vmbuilder to use new Celery step structure 2023-11-16 16:08:49 -05:00
9ab505ec98 Return and show task_name 2023-11-16 14:50:02 -05:00
0cb81f96e6 Use custom task IDs for Celery tasks
Full UUIDs were obnoxiously long, so switch to using just the first
8-character section of a UUID instead. Keeps the list nice and short,
makes them easier to copy, and is just generally nicer.

Could this cause uniqueness problems? Perhaps, but I don't see that
happening nearly frequently enough to matter.
2023-11-16 13:22:14 -05:00
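A minimal sketch of the short task ID described in 0cb81f96e6, with a rough estimate of the uniqueness risk it mentions. The function names are hypothetical and the collision figure is the standard birthday-bound approximation, not a measurement from the project.

```python
import math
import uuid


def short_task_id() -> str:
    """Return the first 8-character section of a random (version 4) UUID."""
    return str(uuid.uuid4()).split("-")[0]


def collision_probability(n_tasks: int, bits: int = 32) -> float:
    """Approximate the chance that at least two of n_tasks IDs collide.

    Uses the birthday-bound approximation 1 - exp(-n(n-1) / 2^(bits+1)).
    Eight hex characters carry 32 random bits.
    """
    pairs = n_tasks * (n_tasks - 1) / 2
    return -math.expm1(-pairs / 2**bits)
```

Under this approximation the risk stays small for a few thousand retained task IDs and grows roughly with the square of the number of IDs, which matches the commit's judgment that collisions should be rare in practice.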
d226e9f4e5 Enable extended Celery results 2023-11-16 12:02:57 -05:00
fa361a55d9 Explicitly use kwargs in Celery task calls 2023-11-16 11:55:30 -05:00
262babc63d Use kwargs for all task arguments
This will help ensure that the CLI frontend can properly parse the args
in a consistent way.
2023-11-16 10:10:48 -05:00
289049d223 Properly handle a "primary" run_on value 2023-11-16 02:49:29 -05:00
0bec6abe71 Return proper run_on for ported tasks 2023-11-16 02:28:57 -05:00
484e6542c2 Port remaining tasks to new task handler
Move the create_vm and run_benchmark tasks to use the new Celery
subsystem, handlers, and wait command. Remove the obsolete, dedicated
API endpoints.

Standardize the CLI client and move the repeated handler code into a
separate common function.
2023-11-16 02:00:23 -05:00
ce17c60a20 Port OSD on-node tasks to Celery worker system
Adds Celery versions of the osd_add, osd_replace, osd_refresh,
osd_remove, and osd_db_vg_add functions.
2023-11-09 23:28:08 -05:00
89681d54b9 Port VM on-node tasks to Celery worker system
Adds Celery versions of the flush_locks, device_attach, and
device_detach functions.
2023-11-06 20:40:46 -05:00
3dc1f57de2 Revert "Switch to ZK+PG over Redis for Celery queue"
This reverts commit 54215bab6c1c420bc284160c6c4633e0ab2e8ff2.
2023-11-05 17:10:46 -05:00
af8a8d969e Ensure queues are set up for non-coordinator nodes
Allows a runner to operate on every possible node, not just
coordinators, as OSDs or other things could be on any node.

Also add more comments.
2023-11-04 15:05:07 -04:00
a6caac1b78 Add Celery queue routing for tasks
By default, tasks will continue to run as they did, on the primary
coordinator's task runner. However, this opens the possibility of
defining more tasks that will run on other nodes or coordinators.
2023-11-04 14:29:59 -04:00
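A minimal sketch of per-task queue routing as described in a6caac1b78. It assumes tasks carry a `run_on` keyword argument naming the target node and that each node's worker consumes a queue named after it; the default queue name and this mapping are assumptions, not the project's actual implementation. Celery accepts a router function with this signature in its `task_routes` setting.

```python
# Hypothetical name of the queue served by the primary coordinator's worker.
DEFAULT_QUEUE = "primary"


def route_task(name, args, kwargs, options, task=None, **kw):
    """Choose a queue for a task from its run_on keyword argument.

    Tasks without a run_on value keep the default behaviour and go to the
    primary coordinator's queue.
    """
    run_on = (kwargs or {}).get("run_on") or DEFAULT_QUEUE
    return {"queue": run_on}
```

With a router like this, a call that passes `run_on="node2"` as a keyword argument would be delivered to the `node2` queue, while existing calls continue to reach the primary coordinator.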
54215bab6c Switch to ZK+PG over Redis for Celery queue
Redis did not provide a distributed solution for the worker, which
precluded several important planned functions. So instead, move to using
Zookeeper + PostgreSQL as the broker and result backend respectively.

Should be a seamless drop-in change, but future uses will require the
database host to be the primary coordinator IP rather than localhost, so
that non-primary hosts can write to the database.
2023-11-04 12:46:34 -04:00