Commit Graph

497 Commits

Author SHA1 Message Date
Joshua Boniface a95e72008e Add size validations for volume clones
Adds the same validations as a volume add or resize to volume clones, to
ensure there is enough free space for them.
2024-02-02 11:37:29 -05:00
Joshua Boniface efc7434143 Add safety check for 80% full size
Adds a check that a volume creation or resize won't violate the 80% full
rule for the storage cluster. This ensures a cluster won't get too full
if a storage volume fills up.

Also adds a force flag throughout the pipeline to override this check,
should an administrator really want to do so.

Closes #177
2024-02-02 11:37:00 -05:00
Joshua Boniface 18f09196be Bump version to 0.9.93 2024-01-30 09:51:21 -05:00
Joshua Boniface df40b779af Bump version to 0.9.92 2024-01-29 09:39:10 -05:00
Joshua Boniface a66449541d Improve script error handling and variables 2024-01-26 15:41:34 -05:00
Joshua Boniface f29b4c2755 Bump version to 0.9.91 2024-01-23 10:40:59 -05:00
Joshua Boniface 94e0287fc4 Add missing modules to default cloud-init 2024-01-21 14:23:44 -05:00
Joshua Boniface 86ca363697 Bump version to 0.9.90 2024-01-11 10:22:48 -05:00
Joshua Boniface 39ec427c42 Remove obsolete swagger files 2024-01-11 10:20:42 -05:00
Joshua Boniface 1ba21312ea Use flask manage for all versions
The legacy method does not work any longer, so just use the Flask manage
script and remove the obsolete legacy one.
2024-01-11 10:20:33 -05:00
Joshua Boniface 09269f182c Add live migrate max downtime selector meta field
Adds a new flag to VM metadata to allow setting the VM live migration
max downtime. This will enable very busy VMs that hang live migration to
have this value changed.
2024-01-11 00:05:50 -05:00
Joshua Boniface e9b6072fa0 Bump version to 0.9.89 2024-01-09 12:15:53 -05:00
Joshua Boniface 3c458ca9a6 Fix broken result backend on old Celery 2024-01-09 12:06:23 -05:00
Joshua Boniface 1d480f5629 Bump version to 0.9.88 2023-12-29 14:56:33 -05:00
Joshua Boniface 123c7ce857 Update copyright header on all files for 2024
Last release of 2023 is probably the best time to do this.
2023-12-29 11:16:59 -05:00
Joshua Boniface 4969e90f8a Allow enable/disable of Prometheus endpoints
Since these are unauthenticated, it might be the case that an
administrator wishes to completely disable these metrics endpoints.
Provide that option via pvc.conf through pvc-ansible's existing
enable_prometheus_exporters option and the new enable_prometheus
configuration flag.

Defaults to "yes" to provide all functionality unless explicitly
disabled, as the author assumes that the PVC API is secured in other
ways as well and that metric information is not completely sensitive.
2023-12-29 09:25:10 -05:00
Joshua Boniface 0bcf8cfe19 Add Zookeeper metrics proxy 2023-12-28 13:53:15 -05:00
Joshua Boniface 8083b7a3e6 Bump version to 0.9.87 2023-12-27 13:40:51 -05:00
Joshua Boniface 3e4cc53fdd Add node network statistics and utilization values
Adds a new physical network interface stats parser to the node
keepalives, and leverages this information to provide a network
utilization overview in the Prometheus metrics.
2023-12-21 15:45:01 -05:00
Joshua Boniface 39f9f3640c Rename health metrics and add resource metrics 2023-12-21 09:40:49 -05:00
Joshua Boniface 0a93f526e0 Bump version to 0.9.86 2023-12-14 14:46:29 -05:00
Joshua Boniface 7c9512fb22 Fix broken config file in API migration script 2023-12-14 14:45:58 -05:00
Joshua Boniface ed9c37982a Move metric collection into daemon library 2023-12-11 19:20:30 -05:00
Joshua Boniface 44a4f0e1f7 Use new info detail output instead of new lists
Avoids multiple additional ZK calls by using data that is now in the
status detail output.
2023-12-10 22:19:09 -05:00
Joshua Boniface 9dc5097dbc Bump version to 0.9.85 2023-12-10 01:00:33 -05:00
Joshua Boniface 9aee2a9075 Bump version to 0.9.84 2023-12-09 23:05:40 -05:00
Joshua Boniface 4ca2381077 Rework metrics output and add combined endpoint 2023-12-09 15:47:40 -05:00
Joshua Boniface a70c1d63b0 Separate state totals from states, separate states 2023-12-09 13:59:17 -05:00
Joshua Boniface fd717b702d Use external list of fault states 2023-12-09 12:51:41 -05:00
Joshua Boniface 132cde5591 Add totals and nice-format states
Avoids tons of annoying rewriting in the UI later.
2023-12-09 12:50:19 -05:00
Joshua Boniface ba565ead4c Report all state combinations in Prom metrics
Ensures that every state combination is always shown to metrics, even if
it contains 0 entries.
2023-12-09 12:40:37 -05:00
Joshua Boniface 2b8abea8df Remove debug printing 2023-12-09 12:22:36 -05:00
Joshua Boniface 9b3c9f1be5 Add Ceph metrics proxy and health fault counts 2023-12-09 12:22:36 -05:00
Joshua Boniface 7373bfed3f Add Prometheus metric exporter
Adds a "fake" Prometheus metrics endpoint which returns cluster status
information in Prometheus format.
2023-12-09 12:22:36 -05:00
Joshua Boniface f01c12c86b Import from pvcworkerd not pvcapid 2023-12-09 12:22:19 -05:00
Joshua Boniface 0bda095571 Move libvirt_schema and fix other imports 2023-12-09 12:20:29 -05:00
Joshua Boniface 7976e1d2d0 Correct import location in scripts 2023-12-09 12:18:33 -05:00
Joshua Boniface 5a7ea25266 Fix incorrect database name entries 2023-12-09 12:12:00 -05:00
Joshua Boniface 20acf3295f Add mass ack/delete of faults 2023-12-06 13:59:39 -05:00
Joshua Boniface 672e58133f Implement interfaces to faults 2023-12-04 01:37:54 -05:00
Joshua Boniface 988de1218f Bump version to 0.9.83 2023-12-01 17:37:42 -05:00
Joshua Boniface 102c3c3106 Port all Celery worker functions to discrete pkg
Moves all tasks run by the Celery worker into a discrete package/module
for easier installation. Also adjusts several parameters throughout to
accomplish this.
2023-11-30 02:24:54 -05:00
Joshua Boniface 0c0fb65c62 Rework Flask API to route Celery tasks manually
Avoids needing to define any of these tasks here; they can all be
defined in the pvcworkerd code.
2023-11-30 00:40:09 -05:00
Joshua Boniface 03a738f878 Move config parser into daemon_lib
And reformat/add config values for API.
2023-11-30 00:05:37 -05:00
Joshua Boniface 647cba3cf5 Expand startup width for new daemon name 2023-11-29 21:21:51 -05:00
Joshua Boniface c8f4cbb39e Fix node entry keys 2023-11-27 13:24:01 -05:00
Joshua Boniface 786fae7769 Improve logo output 2023-11-27 13:01:43 -05:00
Joshua Boniface 17f81e8296 Refactor pvcapid to use new configuration 2023-11-27 12:49:26 -05:00
Joshua Boniface dab7396196 Move to unified pvc.conf configuration file 2023-11-26 14:16:21 -05:00
Joshua Boniface 460a2dd09f Bump version to 0.9.82 2023-11-25 15:38:50 -05:00
Joshua Boniface 24cabd3b99 Fix missing result_backend on Debian 10/11
For whatever reason, a Celery worker on <5.2.x was not picking these up.
Move them back to the root of the module so they are properly picked up
on these older versions but still prevents calling the routing functions
during an API doc generation.
2023-11-25 15:35:25 -05:00
Joshua Boniface 3e001b08b6 Bump version to 0.9.81 2023-11-17 01:29:41 -05:00
Joshua Boniface b66cfb07d8 Isolate cluster-dependent Celery startup
Avoids calling unworkable functions when generating API docs etc. by
isolating them into a Celery startup function called by Daemon.py.

Also update to Celery 4+ settings format.
2023-11-16 20:32:29 -05:00
Joshua Boniface 9885914abd Remove stray periods from messages 2023-11-16 19:56:24 -05:00
Joshua Boniface e8da3714c0 Convert benchmark to use new Celery step structure 2023-11-16 19:36:23 -05:00
Joshua Boniface 4d23d0419c Fix total stage count 2023-11-16 18:41:43 -05:00
Joshua Boniface c1c22c81e7 Ensure script cleanup is done in chroot 2023-11-16 18:27:23 -05:00
Joshua Boniface 712a50ca27 Avoid use of fail here
It causes a reraise with a bunch of extra entries that we don't need.
2023-11-16 18:22:59 -05:00
Joshua Boniface 815041ff20 Fix bugs when main installs fail 2023-11-16 18:20:26 -05:00
Joshua Boniface 823ce8cbf2 Remove duplicate cleanups 2023-11-16 18:19:05 -05:00
Joshua Boniface fca02238d7 Adjust starting text 2023-11-16 18:06:31 -05:00
Joshua Boniface 73a4795967 Avoid fail during yields
This just causes a double-exception, so don't do it.
2023-11-16 17:22:53 -05:00
Joshua Boniface 2a637c62e8 Port provisioner scripts to updated framework
Updates all the example provisioner scripts to use the new functions
exposed by the VMBuilder class as an illustration of how best to use
them.

Also adds a wrapper fail() handler to ensure the cleanup of the script,
as well as the global cleanup, are run on an exception.
2023-11-16 17:04:46 -05:00
Joshua Boniface 618a1c1c10 Add helper functions to VMBuilder instances 2023-11-16 16:17:17 -05:00
Joshua Boniface f50f170d4e Convert vmbuilder to use new Celery step structure 2023-11-16 16:08:49 -05:00
Joshua Boniface 9ab505ec98 Return and show task_name 2023-11-16 14:50:02 -05:00
Joshua Boniface 0cb81f96e6 Use custom task IDs for Celery tasks
Full UUIDs were obnoxiously long, so switch to using just the first
8-character section of a UUID instead. Keeps the list nice and short,
makes them easier to copy, and is just generally nicer.

Could this cause uniqueness problems? Perhaps, but I don't see that
happening nearly frequently enough to matter.
2023-11-16 13:22:14 -05:00
Joshua Boniface 3651885954 Add --events to workers 2023-11-16 12:35:54 -05:00
Joshua Boniface d226e9f4e5 Enable extended Celery results 2023-11-16 12:02:57 -05:00
Joshua Boniface fa361a55d9 Explicitly use kwargs in Celery task calls 2023-11-16 11:55:30 -05:00
Joshua Boniface 0d818017e8 Name the celery workers pvcworkerd@<hostname> 2023-11-16 11:43:17 -05:00
Joshua Boniface 262babc63d Use kwargs for all task arguments
This will help ensure that the CLI frontend can properly parse the args
in a consistent way.
2023-11-16 10:10:48 -05:00
Joshua Boniface 289049d223 Properly handle a "primary" run_on value 2023-11-16 02:49:29 -05:00
Joshua Boniface 0bec6abe71 Return proper run_on for ported tasks 2023-11-16 02:28:57 -05:00
Joshua Boniface 484e6542c2 Port remaining tasks to new task handler
Move the create_vm and run_benchmark tasks to use the new Celery
subsystem, handlers, and wait command. Remove the obsolete, dedicated
API endpoints.

Standardize the CLI client and move the repeated handler code into a
separate common function.
2023-11-16 02:00:23 -05:00
Joshua Boniface aef38639cf Rename pvcapid-worker to pvcworkerd 2023-11-15 20:31:39 -05:00
Joshua Boniface b5e4c52387 Increase worker concurrency to 3 2023-11-10 00:39:42 -05:00
Joshua Boniface ce17c60a20 Port OSD on-node tasks to Celery worker system
Adds Celery versions of the osd_add, osd_replace, osd_refresh,
osd_remove, and osd_db_vg_add functions.
2023-11-09 23:28:08 -05:00
Joshua Boniface 89681d54b9 Port VM on-node tasks to Celery worker system
Adds Celery versions of the flush_locks, device_attach, and
device_detach functions.
2023-11-06 20:40:46 -05:00
Joshua Boniface 3dc1f57de2 Revert "Switch to ZK+PG over Redis for Celery queue"
This reverts commit 54215bab6c.
2023-11-05 17:10:46 -05:00
Joshua Boniface af8a8d969e Ensure queues are set up for non-coordinator nodes
Allows a runner to operate on every possible node, not just
coordinators, as OSDs or other things could be on any node.

Also add more comments.
2023-11-04 15:05:07 -04:00
Joshua Boniface a6caac1b78 Add Celery queue routing for tasks
By default, tasks will continue to run as they did, on the primary
coordinator's task runner. However this opens the possibility for
defining more tasks that will run on other nodes or coordinators.
2023-11-04 14:29:59 -04:00
Joshua Boniface ab629f6b51 Use per-host hostname and queues in worker
Opens up the ability to direct tasks to specific workers.
2023-11-04 13:02:30 -04:00
Joshua Boniface 54215bab6c Switch to ZK+PG over Redis for Celery queue
Redis did not provide a distributed solution for the worker, which
precluded several important planned functions. So instead, move to using
Zookeeper + PostgreSQL as the broker and result backend respectively.

Should be a seamless drop-in change but for future uses requires the
database host to be the primary coordinator IP rather than localhost, so
that writes can occur to the database from non-primary hosts.
2023-11-04 12:46:34 -04:00
Joshua Boniface 64e37ae963 Update OSD replacement functionality
1. Simplify this by leveraging the existing remove_osd/add_osd
functions, since its task was functionally identical to those two in
sequential order.
2. Add support for split OSDs within the command (replacing all OSDs on
the block device(s) as required).
3. Add additional configurability and flexibility around the old device,
weight, and external DB LVs.
2023-11-03 01:45:49 -04:00
Joshua Boniface 980ea6a9e9 Adjust handling of ext_db and _count options
Avoid the use of superfluous flag options, default them to none, and add
support for fixed-size DB LVs.
2023-11-02 13:29:47 -04:00
Joshua Boniface 526a5f4a74 Add support for split OSD adds
Allows creating multiple OSDs on a single (NVMe) block device,
leveraging the "ceph-volume lvm batch" command. Replaces the previous
method of creating OSDs.

Also adds a new ZK item for each OSD indicating if it is split or not.
2023-11-01 21:31:35 -04:00
Joshua Boniface 5b4dd61754 Bump version to 0.9.80 2023-10-27 09:56:31 -04:00
Joshua Boniface 221af3f241 Bump version to 0.9.79 2023-10-24 02:10:24 -04:00
Joshua Boniface c87736eb0a Use consistent path name and format 2023-10-24 01:20:44 -04:00
Joshua Boniface 63d0a85e29 Add backup deletion command 2023-10-24 01:18:27 -04:00
Joshua Boniface 55ca131c2c Handle snapshots on restore and provide options
Also rename the retain option to remove superfluous plural.
2023-10-24 00:25:06 -04:00
Joshua Boniface 8d256a1737 Complete VM restore functionality 2023-10-23 22:23:17 -04:00
Joshua Boniface 4fc9b15652 Fix bad function name 2023-10-17 10:56:32 -04:00
Joshua Boniface b997c6f31e Add support for full VM backups
Adds support for exporting full VM backups, including configuration,
metainfo, and RBD disk images, with incremental support.
2023-10-17 10:15:06 -04:00
Joshua Boniface 522da3fd95 Adjust wording for volume create too 2023-10-03 09:42:23 -04:00
Joshua Boniface 3a1bf0724e Mention file_size as bytes 2023-10-03 09:39:19 -04:00
Joshua Boniface c6c44bf775 Bump version to 0.9.78 2023-09-30 12:57:55 -04:00
Joshua Boniface bbb940da65 Remove spurious comments 2023-09-30 12:37:58 -04:00
Joshua Boniface 35e27f79ef Fix uploading of non-raw image files
Adds a new API query parameter to define the file size, which is then
used for the temporary image. This is required for, at least VMDK, files
to work properly in qemu-img convert.
2023-09-29 16:19:22 -04:00