Commit Graph

333 Commits

Author SHA1 Message Date
Joshua Boniface a0b45a2bcd Always create RBDs with bytes value
Converting into human results in imprecise values when specifying bytes
directly, which in turn breaks VMDK image uploads. Instead, just use the
raw bytes value when creating the volume instead of converting it back.
2023-09-30 12:37:43 -04:00
Joshua Boniface c4397219da Ensure fencing states are properly reflected 2023-09-18 09:59:18 -04:00
Joshua Boniface 311bb69785 Format based on updated Black 2023-09-12 16:41:02 -04:00
Joshua Boniface 653b95ee25 Normalize return messages for node commands 2023-05-04 17:02:46 -04:00
Joshua Boniface 78322f4de4 Improve size handling during volume add/resize 2023-04-28 12:16:16 -04:00
Joshua Boniface c1782c5004 Add full/nearfull OSD health detection 2023-04-28 11:33:39 -04:00
Joshua Boniface e773211293 Add PVC version to cluster status output 2023-02-22 16:09:24 -05:00
Joshua Boniface 70ba364f1d Flip VM state condition to remove shutdown
Don't cause health degredation for shutdown state, and flip the list
around to make it clearer.
2023-02-16 20:32:33 -05:00
Joshua Boniface 1f8561d59a Format cluster health like node healths
Make a cleaner construct here.
2023-02-16 12:33:36 -05:00
Joshua Boniface 1093ca6264 Disallow health less than 0 2023-02-15 16:50:24 -05:00
Joshua Boniface 29584e5636 Add per-node health entries for 3rd party checks 2023-02-15 16:44:49 -05:00
Joshua Boniface f4e8449356 Fix bugs and formatting of health messages 2023-02-15 16:28:56 -05:00
Joshua Boniface ec79acf061 Fix linting of cluster.py file 2023-02-15 15:48:31 -05:00
Joshua Boniface 00586074cf Modify cluster health to use new values 2023-02-15 15:45:43 -05:00
Joshua Boniface f4eef30770 Add JSON health to cluster data 2023-02-15 15:26:57 -05:00
Joshua Boniface b07396c39a Fix bugs if plugins fail to load 2023-02-13 21:51:48 -05:00
Joshua Boniface e6f9e6e0e8 Fix several bugs and optimize output 2023-02-13 16:36:15 -05:00
Joshua Boniface 9c14d84bfc Add node health value and send out API 2023-02-13 15:53:39 -05:00
Joshua Boniface 3c742a827b Initial implementation of monitoring plugin system 2023-02-13 12:06:26 -05:00
Joshua Boniface 671a907236 Allow rename in disable state 2023-01-30 11:48:43 -05:00
Joshua Boniface 38d63d9837 Flip behaviour of memory selectors
It didn't make any sense to me for mem(prov) to be the default selector,
since this has too many caveats versus mem(free). Switch to using
mem(free) as the default (i.e. "mem") and make memprov the alternative.
2022-11-15 15:45:59 -05:00
Joshua Boniface 79eb994a5e Ensure equality of none and None for selector 2022-11-07 11:59:53 -05:00
Joshua Boniface 8af7189dd0 Add module tag for daemon lib 2022-11-04 03:47:18 -04:00
Joshua Boniface 726d0a562b Update copyright header year 2022-10-06 11:55:27 -04:00
Joshua Boniface 881550b610 Actually fix VM sorting
Due to the executor the previous attempt did not work.
2022-08-12 17:46:29 -04:00
Joshua Boniface bcabd7d079 Always sort VM list
Same justification as previous commit.
2022-08-09 12:05:40 -04:00
Joshua Boniface 05a316cdd6 Ensure the node list is sorted
Otherwise the node entries could come back in an arbitrary order; since
this is an ordered list of dictionaries that might not be expected by
the API consumers, so ensure it's always sorted.
2022-08-09 12:03:49 -04:00
Joshua Boniface d8d3feee22 Add selector help and adjust flag name
1. Add documentation on the node selector flags. In the API, reference
the daemon configuration manual which now includes details in this
section; in the CLI, provide the help in "pvc vm define" in detail and
then reference that command's help in the other commands that use this
field.

2. Ensure the naming is consistent in the CLI, using the flag name
"--node-selector" everywhere (was "--selector" for "pvc vm" commands and
"--node-selector" for "pvc provisioner" commands).
2022-06-10 02:42:06 -04:00
Joshua Boniface f8cdcb30ba Add migration selector via free memory
Closes #152
2022-05-18 03:47:16 -04:00
Joshua Boniface c401a1f655 Use consistent language for primary mode
I didn't call it "router" anywhere else, but the state in the list is
called "coordinator" so, call it "coordinator mode".
2022-05-06 15:40:52 -04:00
Joshua Boniface 7a40c7a55b Add support for replacing/refreshing OSDs
Adds commands to both replace an OSD disk, and refresh (reimport) an
existing OSD disk on a new node. This handles the cases where an OSD
disk should be replaced (either due to upgrades or failures) or where a
node is rebuilt in-place and an existing OSD must be re-imported to it.

This should avoid the need to do a full remove/add sequence for either
case.

Also cleans up some aspects of OSD removal that are identical between
methods (e.g. using safe-to-destroy and sleeping after stopping) and
fixes a bug if an OSD does not truly exist when the daemon starts up.
2022-05-06 15:32:06 -04:00
Joshua Boniface 464f0e0356 Store additional OSD information in ZK
Ensures that information like the FSIDs and the OSD LVM volume are
stored in Zookeeper at creation time and updated at daemon start time
(to ensure the data is populated at least once, or if the /dev/sdX
path changes).

This will allow safer operation of OSD removals and the potential
implementation of re-activation after node replacements.
2022-05-02 12:11:39 -04:00
Joshua Boniface d6ca74376a Fix bugs with forced removal 2022-04-29 14:03:07 -04:00
Joshua Boniface 4d698be34b Add OSD removal force option
Ensures a removal can continue even in situations where some step(s)
might fail, for instance removing an obsolete OSD from a replaced node.
2022-04-29 11:16:33 -04:00
Joshua Boniface 1142454934 Add pool PGs count modification
Allows an administrator to adjust the PG count of a given pool. This can
be used to increase the PGs (for example after adding more OSDs) or
decrease it (to remove OSDs, reduce CPU load, etc.).
2021-12-28 21:53:29 -05:00
Joshua Boniface bbfad340a1 Add PGs count to pool list 2021-12-28 21:12:02 -05:00
Joshua Boniface c73939e1c5 Fix issue if pool stats have not updated yet 2021-12-28 21:03:10 -05:00
Joshua Boniface 25fe45dd28 Add device class tiers to Ceph pools
Allows specifying a particular device class ("tier") for a given pool,
for instance SSD-only or NVMe-only. This is implemented with Crush
rules on the Ceph side, and via an additional new key in the pool
Zookeeper schema which is defaulted to "default".
2021-12-28 20:58:15 -05:00
Joshua Boniface 6ccd19e636 Standardize fuzzy matching and use fullmatch
Solves two problems:

1. How match fuzziness was used was very inconsistent; make them all the
same, i.e. "if is_fuzzy and limit, apply .* to both sides".

2. Use re.fullmatch instead of re.match to ensure exact matching of the
regex to the value. Without fuzziness, this would sometimes cause
inconsistent behavior, for instance if a limit was non-fuzzy "vm",
expecting to match the actual "vm", but also matching "vm1" too.
2021-12-06 16:35:29 -05:00
Joshua Boniface 0d857d5ab8 Use positive check rather than negative
Ensure the VM is start before doing shutdown/stop, rather than being
stopped. Prevents overwrite of existing disable state and other
weirdness.
2021-11-06 04:08:33 -04:00
Joshua Boniface 5f193a6134 Perform automatic shutdown/stop on VM disable
Instead of requiring the VM to already be stopped, instead allow disable
state changes to perform a shutdown first. Also add a force option which
will do a hard stop instead of a shutdown.

References #148
2021-11-06 03:57:24 -04:00
Joshua Boniface c41664d2da Reformat code with Black code formatter
Unify the code style along PEP and Black principles using the tool.
2021-11-06 03:02:43 -04:00
Joshua Boniface 87cda72ca9 Fix invalid schema key
Addresses #144
2021-10-09 18:42:33 -04:00
Joshua Boniface 24de0f4189 Add MTU to network creation/modification
Addresses #144
2021-10-09 17:51:32 -04:00
Joshua Boniface 50d8aa0586 Add handlers for client network MTUs
Refactors some of the code in VXNetworkInterface to handle MTUs in a
more streamlined fashion. Also fixes a bug whereby bridge client
networks were being explicitly given the cluster dev MTU which might not
be correct. Now adds support for this option explicitly in the configs,
and defaults to 1500 for safety (the standard Ethernet MTU).

Addresses #144
2021-10-09 17:02:27 -04:00
Joshua Boniface c0f7ba0125 Add limit negation to VM list
When using the "state", "node", or "tag" arguments to a VM list, add
support for a "negate" flag to look for all VMs *not in* the state,
node, or tag state.
2021-10-07 11:50:52 -04:00
Joshua Boniface 65df807b09 Add support for configurable OSD DB ratios
The default of 0.05 (5%) is likely ideal in the initial implementation,
but allow this to be set explicitly for maximum flexibility in
space-constrained or performance-critical use-cases.
2021-09-24 01:06:39 -04:00
Joshua Boniface adc8a5a3bc Add separate OSD DB device support
Adds in three parts:

1. Create an API endpoint to create OSD DB volume groups on a device.
Passed through to the node via the same command pipeline as
creating/removing OSDs, and creates a volume group with a fixed name
(osd-db).

2. Adds API support for specifying whether or not to use this DB volume
group when creating a new OSD via the "ext_db" flag. Naming and sizing
is fixed for simplicity and based on Ceph recommendations (5% of OSD
size). The Zookeeper schema tracks the block device to use during
removal.

3. Adds CLI support for the new and modified API endpoints, as well as
displaying the block device and DB block device in the OSD list.

While I debated supporting adding a DB device to an existing OSD, in
practice this ended up being a very complex operation involving stopping
the OSD and setting some options, so this is not supported; this can be
specified during OSD creation only.

Closes #142
2021-09-23 13:59:49 -04:00
Joshua Boniface 58db537093 Add memory and vCPU checks to VM define/modify
Ensures that a VM won't:

(a) Have provisioned more RAM than there is available on a given node.
Due to memory overprovisioning, this is simply a "is the VM memory count
more than the node count", and doesn't factor in free or used memory on
a node, total cluster usage, etc. So if a node has 64GB total RAM, the
VM limit is 64GB. It is up to an administrator to ensure sanity *below*
that value.

(b) Have provisioned more vCPUs than there are CPU cores on the node,
minus 2 to account for hypervisor/storage processes. Will ensure there
is no severe CPU contention caused by a single VM having more vCPUs than
there are actual execution threads available.

Closes #139
2021-09-13 01:51:21 -04:00
Joshua Boniface e71a6c90bf Add pool size check when resizing volumes
Closes #140
2021-09-12 19:54:51 -04:00
Joshua Boniface e962743e51 Add VM device hot attach/detach support
Adds a new API endpoint to support hot attach/detach of devices, and the
corresponding client-side logic to use this endpoint when doing VM
network/storage add/remove actions.

The live attach is now the default behaviour for these types of
additions and removals, and can be disabled if needed.

Closes #141
2021-09-12 19:33:00 -04:00
Joshua Boniface 73e8149cb0 Remove explicit image-features from rbd cmd
This should be managed in ceph.conf with the `rbd default
features` configuration option instead, and thus can be tailored to the
underlying OS version.
2021-07-30 11:33:59 -04:00
Joshua Boniface 4a7246b8c0 Ensure RBD resize has bytes appended
If this isn't, the resize will be interpreted as a MB value and result
in an absurdly big volume instead. This is the same consistency
validation that occurs on add.
2021-07-30 11:25:13 -04:00
Joshua Boniface c49351469b Revert "Ensure consistent sizing of volumes"
This reverts commit dc03e95bbf.
2021-07-29 15:30:00 -04:00
Joshua Boniface dc03e95bbf Ensure consistent sizing of volumes
Convert from human to bytes, then to megabytes and always pass this to
the RBD command. This ensures consistency regardless of what is actually
passed by the user.
2021-07-29 15:14:25 -04:00
Joshua Boniface 45f23c12ea Remove logs from schema validation
These are managed entirely by the logging subsystem not by the schema
handler due to catch-22's.
2021-07-20 00:00:37 -04:00
Joshua Boniface b14bc7e3a3 Add retry to log writes 2021-07-19 13:11:28 -04:00
Joshua Boniface 4d6842f942 Don't bail out if write fails, keep retrying 2021-07-19 13:09:36 -04:00
Joshua Boniface e9df043c0a Ensure ZK logging does not block startup 2021-07-19 12:19:59 -04:00
Joshua Boniface 5be968123f Readd 1 second queue get timeout
Otherwise daemon stops will sometimes inexplicably block.
2021-07-18 22:17:57 -04:00
Joshua Boniface 99fd7ebe63 Fix excessive CPU due to looping 2021-07-18 22:06:50 -04:00
Joshua Boniface cffc96d156 Fix failure in creating base keys 2021-07-18 21:00:23 -04:00
Joshua Boniface b770e15a91 Fix final termination of logger
We need to do a bit more finagling with the logger on termination to
ensure that all messages are written and the queue drained before
actually terminating.
2021-07-18 19:53:00 -04:00
Joshua Boniface 982dfd52c6 Adjust date output format 2021-07-18 19:00:54 -04:00
Joshua Boniface a088aa4484 Add node log functions to API and CLI 2021-07-18 18:54:28 -04:00
Joshua Boniface 323c7c41ae Implement node logging into Zookeeper
Adds the ability to send node daemon logs to Zookeeper to facilitate a
command like "pvc node log", similar to "pvc vm log". Each node stores
its logs in a separate tree under "/logs" which can then be combined or
queried. By default, set by config, only 2000 lines are kept.
2021-07-18 17:11:43 -04:00
Joshua Boniface 75fb60b1b4 Add VM list filtering by tag
Uses same method as state or node filtering, rather than altering how
the main LIMIT field works.
2021-07-14 00:59:20 -04:00
Joshua Boniface 9ea9ac3b8a Revamp tag handling and display
Add an additional protected class, limit manipulation to one at a time,
and ensure future flexibility. Also makes display consistent with other
VM elements.
2021-07-13 22:39:52 -04:00
Joshua Boniface c0a3467b70 Simplify VM metadata reads
Directly call the new common getDomainMetadata function to avoid
excessive Zookeeper calls for this information.
2021-07-13 19:05:33 -04:00
Joshua Boniface 9a199992a1 Add functions for manipulating VM tags
Adds tags to schema (v3), to VM definition, adds function to modify
tags, adds function to get tags, and adds tags to VM data output.

Tags will enable more granular classification of VMs based either on
administrator configuration or from automated system events.
2021-07-13 19:05:33 -04:00
Joshua Boniface c76149141f Only log ZK connections when persistent
Prevents spam in the API logs.
2021-07-10 23:35:49 -04:00
Joshua Boniface 0699c48d10 Fix bad schema path name 2021-07-09 16:47:09 -04:00
Joshua Boniface 4832245d9c Handle non-RBD disks and non-RBD errors better 2021-07-09 15:48:57 -04:00
Joshua Boniface 2138f2f59f Fail VM removal on disk removal failures
Prevents bad states where the VM is "removed" but some of its disks
remain due to e.g. stuck watchers.

Rearrange the sequence so it goes stop, delete disks, then delete VM,
and then return a failure if any of the disk(s) fail to remove, allowing
the task to be rerun after fixing the problem.
2021-07-09 15:39:06 -04:00
Joshua Boniface d1d355a96b Avoid errors if stats data is None 2021-07-09 13:13:54 -04:00
Joshua Boniface c0c9327a7d Return an empty log if the value is None 2021-07-09 13:08:00 -04:00
Joshua Boniface 5ffabcfef5 Avoid failing if we can't get the future data 2021-07-09 13:05:37 -04:00
Joshua Boniface 80fe96b24d Add some additional docstrings 2021-07-07 12:28:08 -04:00
Joshua Boniface 80f04ce8ee Remove connection renewal in state handler
Regenerating the ZK connection was fraught with issues, including
duplicate connections, strange failures to reconnect, and various other
wonkiness.

Instead let Kazoo handle states sensibly. Kazoo moves to SUSPENDED state
when it loses connectivity, and stays there indefinitely (based on
cursory tests). And Kazoo seems to always resume from this just fine on
its own. Thus all that hackery did nothing but complicate reconnection.

This therefore turns the listener into a purely informational function,
providing logs of when/why it failed, and we also add some additional
output messages during initial connection and final disconnection.
2021-07-07 11:55:12 -04:00
Joshua Boniface a8c28786dd Better handle empty ipaths in schema
When trying to write to sub-item paths that don't yet exist, the
previous method would just blindly write to whatever the root key is,
which is never what we actually want.

Instead, check explicitly for a "base path" situation, and handle that.
Then, if we try to get a subpath that isn't valid, return None. Finally
in the various functions, if the path is None, just continue (or return
false/None) and (try to) chug along.
2021-07-05 23:35:03 -04:00
Joshua Boniface c45804e8c1 Revert "Return none if a schema path is not found"
This reverts commit b1fcf6a4a5.
2021-07-05 23:16:39 -04:00
Joshua Boniface b1fcf6a4a5 Return none if a schema path is not found
This can cause overwriting of unintended keys, so should not be
happening. Will have to find the bugs this causes.
2021-07-05 17:15:55 -04:00
Joshua Boniface a69105569f Add node PVC version data to Node information
Allows API client to see the currently-active version of the node
daemon.
2021-07-05 09:57:38 -04:00
Joshua Boniface e44f3d623e Remove unnecessary try/except blocks from VM reads
The zkhandler read() function takes care of ensuring there is a None
value returned if these fail, so these aren't required. Makes the code a
fair bit more readable here.
2021-07-02 12:01:58 -04:00
Joshua Boniface 43009486ae Move Ceph pool/volume list assembly to thread pool
Same reasons as the VM list, though less impactful.
2021-07-01 17:33:13 -04:00
Joshua Boniface 58789f1db4 Move VM list assembly to thread pool
This helps parallelize the numerous Zookeeper calls a little bit, at
least within the bounds of the GIL, to improve performance when getting
a large list of VMs. The max_workers value is capped at 32 to avoid
causing too many threads during concurrent executions, but still
provides a noticeable speedup (on the order of 0.2-0.4 seconds with 75
VMs, scaling up further as counts grow).
2021-07-01 17:32:47 -04:00
Joshua Boniface baf4c3fbc7 Add performance profiler function
Usable anywhere that the global daemon "config" parameter can be passed
in (e.g. pvcapid/helper.py, pvcnoded/Daemon.py, etc.). Stores results in
a subdirectory of the PVC logdir called "profiler" if this directory can
be created, or prints results.

The debug config parameter ensures that the profiler can be added to
functions and not run unless the server is explicitly in debug mode.
Might not be useful as I don't initially plan to add this to every
function (only when investigating performance problems), but this
flexibility allows that to change later.
2021-07-01 14:01:33 -04:00
Joshua Boniface e093efceb1 Add NoNodeError handlers in ZK locks
Instead of looping 5+ times acquiring an impossible lock on a
nonexistent key, just fail on a different error and return failure
immediately.

This is likely a major corner case that shouldn't happen, but better to
be safe than 500.
2021-07-01 01:17:38 -04:00
Joshua Boniface a080598781 Avoid superfluous ZK exists calls
These cause a major (2x) slowdown in read calls since Zookeeper
connections are expensive/slow. Instead, just try the thing and return
None if there's no key there.

Also wrap the children command in similar error handling since that did
not exist and could likely cause some bugs at some point.
2021-07-01 01:15:51 -04:00
Joshua Boniface 6adaf1f669 Fix incorrect handling of deletions in init 2021-06-29 18:41:02 -04:00
Joshua Boniface f91c07fdcf Re-add UUID limit matching for full UUIDs
This *was* valuable when passing a full UUID in, so go back to that.
Verify first that the limit string is an actual UUID, and then compare
against it if applicable.
2021-06-28 12:27:43 -04:00
Joshua Boniface c54f66efa8 Limit match only on VM name
I can see no possible reason to want to do limits against UUIDs, but
supporting that means match is not what one would expect since a random
UUID could match the limit. So only limit based on the name.
2021-06-23 19:17:35 -04:00
Joshua Boniface cd860bae6b Optimize VM list in API
With many VMs this slows down linearly. Rework it a bit so there are
fewer calls to getInformationFromXML and so the processing could happen
in parallel at some point.
2021-06-23 19:14:26 -04:00
Joshua Boniface 07dbd55f03 Use list comprehension to compare against source 2021-06-22 02:31:14 -04:00
Joshua Boniface 6cd0ccf0ad Fix network check on VM config modification 2021-06-22 02:21:55 -04:00
Joshua Boniface e623909a43 Store PHY MAC for VFs and restore after free 2021-06-22 00:56:47 -04:00
Joshua Boniface 60e1da09dd Don't try any shenannegans when updating NICs
Trying to do this on the VMInstance side had problems because we can't
differentiate the 3 types of migration there. So, just update this in
the API side and hope everything goes well.

This introduces an edge bug: if a VM is using a macvtap SR-IOV device,
and then tries to migrate, and the migrate is aborted, the NIC lists
will be inconsistent.

When I revamp the VMInstance in the future, I should be able to correct
this, but for now we'll have to live with that edgecase.
2021-06-22 00:00:50 -04:00
Joshua Boniface dc560c1dcb Better handle retcodes in migrate update 2021-06-21 23:46:47 -04:00
Joshua Boniface 68c7481aa2 Ensure offline migrations update SR-IOV NIC states 2021-06-21 23:35:52 -04:00
Joshua Boniface 24ce361a04 Ensure SR-IOV NIC states are updated on migration 2021-06-21 23:18:34 -04:00