Commit Graph

448 Commits

Author SHA1 Message Date
Joshua Boniface 41cd34ba4d Allow specifying job names for benchmarks 2024-09-18 14:55:12 -04:00
Joshua Boniface 736762901c Update benchmarks to include resource utilization
Adds additional polled information on node cpu, memory, and network
bandwidth for the node running the test. This should provide additional
useful information about the results of the test.

Also bumps the test format to 2 to ensure clients can handle the changes
properly.
2024-09-18 14:32:03 -04:00
Joshua Boniface 2de999c700 Add total cluster utilization stats
Useful for evaluating the cluster resources as a whole.
2024-09-05 16:05:33 -04:00
Joshua Boniface 7543eb839d Add dedicated volume scan endpoint
Allows an imported volume to be scanned for stats independently.

Designed to be used as part of a snapshot import via API, to allow the
"create" to happen before the real import (to check for available space,
etc.) and then run this import after when the RBD volume actually
exists.
2024-09-03 20:32:27 -04:00
Joshua Boniface 0f578d7c7d Ensure decimals are captured from size regex 2024-08-30 10:51:41 -04:00
Joshua Boniface f87b96887c Add detect string parser with nvme
Some newer servers do not report NVMe device paths properly using
`lsscsi` as expected. To work around this, add an `nvme`-based detect
parser that is called if the `lsscsi` parser returns a `-` (or None).
2024-08-30 10:41:56 -04:00
Joshua Boniface 8177d5f8b7 Use absolute path for ZK schema 2024-08-27 09:40:24 -04:00
Joshua Boniface f57b8d4a15 Simplify Celery event handling
It was far too cumbersome to report every possible stage here in a
consistent way. Realistically, this command will be run silently from
cron 99.95% of the time, so all this overcomplexity to handle individual
Celery state updates just isn't worth it.
2024-08-25 21:59:12 -04:00
Joshua Boniface e938140414 Refactor autobackups to make more sense 2024-08-25 19:21:00 -04:00
Joshua Boniface 4ef5fbdbe8 Restore previous autobackup continue behaviour
With the original system, the failure of one VM's backups would not
trigger a total fault, thus allowing other backups to complete.
Restore that behaviour.
2024-08-25 17:04:43 -04:00
Joshua Boniface f7926726f2 Adjust snapshot name again 2024-08-25 16:20:59 -04:00
Joshua Boniface a957218976 Fix staging for summary report 2024-08-25 16:11:35 -04:00
Joshua Boniface 61365e6e01 Adjust autobackup snap name and output messages 2024-08-25 16:09:52 -04:00
Joshua Boniface 35fe16ce75 Revert "Adjust stage naming to reflect autobackup stages"
This reverts commit c1f320ede2.
2024-08-25 15:58:25 -04:00
Joshua Boniface c1f320ede2 Adjust stage naming to reflect autobackup stages 2024-08-25 15:55:16 -04:00
Joshua Boniface f1668bffcc Refactor autobackups to implement vm.worker defs
Avoid trying to subcall other Celery worker tasks, as this just gets
very screwy with the stages. Instead reimplement what is needed directly
here. While this does cause a fair bit of code duplication, I believe
the resulting clarity is worthwhile.
2024-08-25 15:54:03 -04:00
Joshua Boniface c0686fc5c7 Remove stage overrides
These aren't needed after pending refactor.
2024-08-25 15:17:46 -04:00
Joshua Boniface 4b37c4fea3 Fix assignment bug 2024-08-25 14:10:59 -04:00
Joshua Boniface 0d918d66fe Port VM autobackups into pvcworkerd with snaps
Moves VM autobackups from being in-CLI to being handled by the
pvcworkerd system on the primary coordinator. Turns the CLI autobackup
command into an actual API client endpoint rather than having its logic
in the CLI.

In addition, modifies the new autobackup to leverage the new "pvc vm
snapshot" function set, just with special snapshot names. This helps
automate this within the new snapshot scaffolding.
2024-08-23 17:23:06 -04:00
Joshua Boniface f6c009beac Allow overriding stages in some commands
This allows them to be called by autobackup commands while still
preserving the current Celery report flow.
2024-08-23 11:21:02 -04:00
Joshua Boniface fc89f4f2f5 Fix error message contents 2024-08-23 10:23:51 -04:00
Joshua Boniface 565011b277 Set snapshot name before start 2024-08-20 23:01:52 -04:00
Joshua Boniface 0bf9cc6b06 Improve stage handling
Run start() at the beginning, and leverage the new tweaks to the CLI to
update the total steps later. Allows errors to be handled gracefully
2024-08-20 17:50:27 -04:00
Joshua Boniface f2dfada73e Improve return handling for snapshot tasks 2024-08-20 17:40:44 -04:00
Joshua Boniface 9b3075be18 Add UUID check and fix wording
Don't suggest renaming any more as it's not enough.
2024-08-20 17:05:27 -04:00
Joshua Boniface 9a661d0173 Convert VM snapshots to worker tasks
Improves manageability and offloads these from the API context.
2024-08-20 16:50:41 -04:00
Joshua Boniface 4a0680b27f Fix issues with snapshot imports 2024-08-20 13:59:05 -04:00
Joshua Boniface f42a1bad0e Allow passing zk_only into VM snapshot creation 2024-08-20 12:57:53 -04:00
Joshua Boniface 3fb52a13c2 Add missing VM states from snapshots 2024-08-20 11:53:57 -04:00
Joshua Boniface 8937ddf331 Simplify VM rename to preserve data
A rename is simply a change to two values, so instead of undefining and
re-defining the VM, just edit those two fields. This ensures things like
snapshots are preserved automatically.
2024-08-20 11:37:28 -04:00
Joshua Boniface 7cc354466f Finish implementing snapshot import 2024-08-20 11:25:09 -04:00
Joshua Boniface 0a8bad3418 Add VM snapshot import 2024-08-20 10:53:56 -04:00
Joshua Boniface f10d32987b Fix up comments 2024-08-20 10:37:58 -04:00
Joshua Boniface d060787503 Add initial implementation of snapshot export 2024-08-19 18:46:07 -04:00
Joshua Boniface f1b4593367 Store current stats with snapshots
Allows getting info like size, etc. for the snapshot.
2024-08-19 14:07:27 -04:00
Joshua Boniface 33f905459a Implement VM rollback
Closes #184
2024-08-16 10:47:18 -04:00
Joshua Boniface 359191c83f Ensure snapshot name does not already exist 2024-08-16 10:46:25 -04:00
Joshua Boniface 3d0d5e63f6 Make default snap name just the datestring 2024-08-16 10:46:25 -04:00
Joshua Boniface e6bfbb6d45 Actually fix incorrect naming bug 2024-08-16 10:46:25 -04:00
Joshua Boniface b80f9e28dc Add human-readable age to snapshots
This is parsed server-side for consistent timing and to simplify the API
consumers.
2024-08-16 10:46:25 -04:00
Joshua Boniface fbd5b3cca3 Remove is_backup flag for snapshots
This won't be needed for anything.
2024-08-16 10:46:25 -04:00
Joshua Boniface 2b1082590e Fix bug in snapshot removal 2024-08-16 10:46:25 -04:00
Joshua Boniface 6fc7f45027 Add snapshot lists and timestamp
Adds snapshots to the list of data in VM objects
2024-08-16 10:46:25 -04:00
Joshua Boniface 0c240a5129 Add VM snapshot removal 2024-08-16 10:46:25 -04:00
Joshua Boniface 553c1e670e Add VM snapshots functionality
Adds the ability to create snapshots of an entire VM, including all its
RBD disks and the VM XML config, though not any PVC metadata.
2024-08-16 10:46:25 -04:00
Joshua Boniface 942de9f15b Add better exception handling for XML configs 2024-08-16 10:46:04 -04:00
Joshua Boniface c186015d6f Add check for invalid profile 2024-07-13 17:13:40 -04:00
Joshua Boniface 7a99e0e524 Fix bugs listing snapshots by pool/volume
The logic of this didn't work, so reconfigure to use these like limits.
Also fixes a bug in the upper getCephVolumes for invalid pools.
2024-05-16 16:32:22 -04:00
Joshua Boniface 5d0e7931d1 Add support for rolling back snapshots
We supported creating snapshots, but not doing anything with them. This
removes the manual task of restoring a snapshot and replace it with a
PVC abstraction of rolling back to a snapshot.

While Ceph recommends cloning a snapshot instead of rolling back, due to
the time taken, in our usecase I don't think that is an optimal
strategy, as it will leave dangling clones that we'd then have to
manage.

Closes #183
2024-05-13 15:24:51 -04:00
Joshua Boniface ab944f9b95 Add RBD snap purge during volume removal
Fixes #180
2024-04-19 10:31:11 -04:00