Commit Graph

114 Commits

Author SHA1 Message Date
Joshua Boniface 726d0a562b Update copyright header year 2022-10-06 11:55:27 -04:00
Joshua Boniface c401a1f655 Use consistent language for primary mode
I didn't call it "router" anywhere else, but the state in the list is
called "coordinator" so, call it "coordinator mode".
2022-05-06 15:40:52 -04:00
Joshua Boniface 7a40c7a55b Add support for replacing/refreshing OSDs
Adds commands to both replace an OSD disk, and refresh (reimport) an
existing OSD disk on a new node. This handles the cases where an OSD
disk should be replaced (either due to upgrades or failures) or where a
node is rebuilt in-place and an existing OSD must be re-imported to it.

This should avoid the need to do a full remove/add sequence for either
case.

Also cleans up some aspects of OSD removal that are identical between
methods (e.g. using safe-to-destroy and sleeping after stopping) and
fixes a bug if an OSD does not truly exist when the daemon starts up.
2022-05-06 15:32:06 -04:00
Joshua Boniface 4d698be34b Add OSD removal force option
Ensures a removal can continue even in situations where some step(s)
might fail, for instance removing an obsolete OSD from a replaced node.
2022-04-29 11:16:33 -04:00
Joshua Boniface 1142454934 Add pool PGs count modification
Allows an administrator to adjust the PG count of a given pool. This can
be used to increase the PGs (for example after adding more OSDs) or
decrease it (to remove OSDs, reduce CPU load, etc.).
2021-12-28 21:53:29 -05:00
Joshua Boniface 25fe45dd28 Add device class tiers to Ceph pools
Allows specifying a particular device class ("tier") for a given pool,
for instance SSD-only or NVMe-only. This is implemented with Crush
rules on the Ceph side, and via an additional new key in the pool
Zookeeper schema which is defaulted to "default".
2021-12-28 20:58:15 -05:00
Joshua Boniface 5f193a6134 Perform automatic shutdown/stop on VM disable
Instead of requiring the VM to already be stopped, instead allow disable
state changes to perform a shutdown first. Also add a force option which
will do a hard stop instead of a shutdown.

References #148
2021-11-06 03:57:24 -04:00
Joshua Boniface c41664d2da Reformat code with Black code formatter
Unify the code style along PEP and Black principles using the tool.
2021-11-06 03:02:43 -04:00
Joshua Boniface 24de0f4189 Add MTU to network creation/modification
Addresses #144
2021-10-09 17:51:32 -04:00
Joshua Boniface c0f7ba0125 Add limit negation to VM list
When using the "state", "node", or "tag" arguments to a VM list, add
support for a "negate" flag to look for all VMs *not in* the state,
node, or tag state.
2021-10-07 11:50:52 -04:00
Joshua Boniface 65df807b09 Add support for configurable OSD DB ratios
The default of 0.05 (5%) is likely ideal in the initial implementation,
but allow this to be set explicitly for maximum flexibility in
space-constrained or performance-critical use-cases.
2021-09-24 01:06:39 -04:00
Joshua Boniface adc8a5a3bc Add separate OSD DB device support
Adds in three parts:

1. Create an API endpoint to create OSD DB volume groups on a device.
Passed through to the node via the same command pipeline as
creating/removing OSDs, and creates a volume group with a fixed name
(osd-db).

2. Adds API support for specifying whether or not to use this DB volume
group when creating a new OSD via the "ext_db" flag. Naming and sizing
is fixed for simplicity and based on Ceph recommendations (5% of OSD
size). The Zookeeper schema tracks the block device to use during
removal.

3. Adds CLI support for the new and modified API endpoints, as well as
displaying the block device and DB block device in the OSD list.

While I debated supporting adding a DB device to an existing OSD, in
practice this ended up being a very complex operation involving stopping
the OSD and setting some options, so this is not supported; this can be
specified during OSD creation only.

Closes #142
2021-09-23 13:59:49 -04:00
Joshua Boniface e962743e51 Add VM device hot attach/detach support
Adds a new API endpoint to support hot attach/detach of devices, and the
corresponding client-side logic to use this endpoint when doing VM
network/storage add/remove actions.

The live attach is now the default behaviour for these types of
additions and removals, and can be disabled if needed.

Closes #141
2021-09-12 19:33:00 -04:00
Joshua Boniface a088aa4484 Add node log functions to API and CLI 2021-07-18 18:54:28 -04:00
Joshua Boniface 75fb60b1b4 Add VM list filtering by tag
Uses same method as state or node filtering, rather than altering how
the main LIMIT field works.
2021-07-14 00:59:20 -04:00
Joshua Boniface 9ea9ac3b8a Revamp tag handling and display
Add an additional protected class, limit manipulation to one at a time,
and ensure future flexibility. Also makes display consistent with other
VM elements.
2021-07-13 22:39:52 -04:00
Joshua Boniface 27f1758791 Add tags manipulation to API
Also fixes some checks for Metadata too since these two actions are
almost identical, and adds tags to define endpoint.
2021-07-13 19:05:33 -04:00
Joshua Boniface c0a3467b70 Simplify VM metadata reads
Directly call the new common getDomainMetadata function to avoid
excessive Zookeeper calls for this information.
2021-07-13 19:05:33 -04:00
Joshua Boniface 61465ef38f Add profiler to several other functions in API 2021-07-02 01:53:19 -04:00
Joshua Boniface 20542c3653 Add profiler to cluster status function 2021-07-01 17:35:29 -04:00
Joshua Boniface 13cc0f986f Implement SR-IOV VF config set
Also fixes some random bugs, adds proper interface sorting, and assorted
tweaks.
2021-06-21 18:40:11 -04:00
Joshua Boniface a697c2db2e Add SRIOV PF and VF listing to API 2021-06-21 01:42:55 -04:00
Joshua Boniface 01c82f5d19 Move backup and restore into common 2021-06-13 14:25:51 -04:00
Joshua Boniface a48bf2d71e More gracefully handle none selectors
Allow selection of "none" as the node selector, and handle this by
always using the cluster default instead of writing it in.
2021-06-01 11:13:13 -04:00
Joshua Boniface 33a54cf7f2 Move configuration keys to /config tree 2021-06-01 10:48:55 -04:00
Joshua Boniface a1969eb981 Allow overwrite during init command 2021-05-31 00:12:28 -04:00
Joshua Boniface c7992000eb Explicitly output JSON cluster data 2021-05-30 23:50:42 -04:00
Joshua Boniface a1e8cc5867 Skip patroni tree during backups 2021-05-30 23:39:37 -04:00
Joshua Boniface 163015bd4a Port remaining helper functions to ZKConnection 2021-05-29 00:30:42 -04:00
Joshua Boniface 9cd121ef9f Convert remaining VM functions 2021-05-29 00:16:26 -04:00
Joshua Boniface ea63a58b21 Port two more functions to new decorator 2021-05-28 23:38:53 -04:00
Joshua Boniface c6bececb55 Revamp config parsing and imports
Brings sanity to the passing of the config variable around the various
submodules for use in the ZKConnection decorator.
2021-05-28 23:33:36 -04:00
Joshua Boniface f46c2e7f6a Implement VM rename functionality
Closes #125
2021-05-23 17:21:19 -04:00
Joshua Boniface 0bf276fd51 Update copyright year in headers 2021-03-25 17:01:55 -04:00
Joshua Boniface f4ec161aa2 Update file copyright header.
Remove the option to select a later version of the GPL.
2021-03-25 16:58:02 -04:00
Joshua Boniface 2ac31e0a14 Handle issues with state retrieval 2020-12-08 23:26:29 -05:00
Joshua Boniface 185615e6e8 Don't strip single-element lists
This was a dumb decision that complicated handling of single-item
entries.
2020-12-01 03:23:18 -05:00
Joshua Boniface 8f705c9cc2 Add cluster backup + restore functionality
Adds cluster backup (JSON dump) and restore functions for use in
disaster recovery.

Further, adds additional confirmation to the initialization (as well as
restore) endpoints to avoid accidental triggering, and also groups the
init, backup, and restore commands in the CLI into a new "task"
subsection.
2020-11-24 02:39:06 -05:00
Joshua Boniface ee4d682b29 Correct faulty function naming 2020-11-09 09:45:54 -05:00
Joshua Boniface 260b39ebf2 Lint: E302 expected 2 blank lines, found X 2020-11-07 14:45:24 -05:00
Joshua Boniface d74cf00feb Lint: F821 undefined name 'data'
Not really a lint, but simply makes the image uploader work the same way
that the OVA uploader does. May need more tweaking if this broke it.
2020-11-07 13:26:12 -05:00
Joshua Boniface 6c56d45345 Lint: F821 undefined name 'config'
This variable is set after importing these files by the flaskapi module.
Thus, simply set a default at the top of each file to avoid linting
errors.
2020-11-07 13:23:34 -05:00
Joshua Boniface 975b52ad8e Lint: E128 continuation line under-indented for visual indent 2020-11-07 13:07:07 -05:00
Joshua Boniface 3f242cd437 Lint: E202 whitespace before '}' 2020-11-07 12:57:42 -05:00
Joshua Boniface e333f2b935 Lint: E201 whitespace after '{' 2020-11-07 12:38:31 -05:00
Joshua Boniface 292ccdd94e Lint: E231 missing whitespace after ':' 2020-11-07 12:34:47 -05:00
Joshua Boniface cb2defbde9 Lint: W391 blank line at end of file 2020-11-06 21:14:19 -05:00
Joshua Boniface fde8ea2fea Lint: W291 trailing whitespace 2020-11-06 19:44:14 -05:00
Joshua Boniface d9e7b7ec15 Lint: F401 <library> imported but unused 2020-11-06 19:22:49 -05:00
Joshua Boniface 63f4f9aed7 Lint: E722 do not use bare 'except' 2020-11-06 18:55:10 -05:00
Joshua Boniface ec0b8acf90 Support per-VM migration type selectors
Allow a VM to specify its migration type as a default choice. The valid
options are "default" (i.e. behave as now), "live" which forces a live
migration only, and "shutdown" which forces a shutdown migration only.
The new option is treated as a VM meta option and is set to default if
not found.
2020-10-29 12:01:29 -04:00
Joshua Boniface ffaa4c033f Improve handling of large file uploads
By default, Werkzeug would require the entire file (be it an OVA or
image file) to be uploaded and saved to a temporary, fake file under
`/tmp`, before any further processing could occur. This blocked most of
the execution of these functions until the upload was completed.

This entirely defeated the purpose of what I was trying to do, which was
to save the uploads directly to the temporary blockdev in each case,
thus avoiding any sort of memory or (host) disk usage.

The solution is two-fold:

  1. First, ensure that the `location='args'` value is set in
  RequestParser; without this, the `files` portion would be parsed
  during the argument parsing, which was the original source of this
  blocking behaviour.

  2. Instead of the convoluted request handling that was being done
  originally here, instead entirely defer the parsing of the `files`
  arguments until the point in the code where they are ready to be
  saved. Then, using an override stream_factory that simply opens the
  temporary blockdev, the upload can commence while being written
  directly out to it, rather than using `/tmp` space.

This does alter the error handling slightly; it is impossible to check
if the argument was passed until this point in the code, so it may take
longer to fail if the API consumer does not specify a file as they
should. This is a minor trade-off and I would expect my API consumers to
be sane here.
2020-10-19 01:00:34 -04:00
Joshua Boniface 37a58d35e8 Implement limiting of node output
Closes #98
2020-06-25 11:51:53 -04:00
Joshua Boniface 654a3cb7fa Improve debug output and use ceph df util data 2020-06-06 22:52:49 -04:00
Joshua Boniface ce60836c34 Allow enforcement of live migration
Provides a CLI and API argument to force live migration, which triggers
a new VM state "migrate-live". The node daemon VMInstance during migrate
will read this flag from the state and, if enforced, will not trigger a
shutdown migration.

Closes #95
2020-06-06 12:00:44 -04:00
Joshua Boniface ca5327b908 Make strtobool even more robust
If strtobool fails, return False always.
2020-03-09 09:30:16 -04:00
Joshua Boniface d36d8e0637 Use custom strtobool to handle weird edge cases 2020-03-06 09:40:13 -05:00
Joshua Boniface 56a9e48163 Normalize all return messages
Ensure all API return messages are formated the same: no "error", a
final period except when displaying Exception text, and a regular spaced
out format.
2020-02-20 22:42:19 -05:00
Joshua Boniface 8678dedfea Revert "Implement wait for node coordinator transition"
This reverts commit 0aefafa7f7.

This does not work since the API goes away during the transition.

References #72
2020-02-19 10:50:21 -05:00
Joshua Boniface 0aefafa7f7 Implement wait for node coordinator transition
References #72
2020-02-19 10:50:04 -05:00
Joshua Boniface 99f579e41a Add wait support to API commands
References #72
2020-02-19 09:51:42 -05:00
Joshua Boniface e419855911 Support converting types during upload
Allow the user to specify other, non-raw files and upload them,
performing a conversion with qemu-img convert and a temporary block
device as a shim (since qemu-img can't use FIFOs).

Also ensures that the target volume exists before proceeding.

Addresses #68
2020-02-09 20:29:12 -05:00
Joshua Boniface 49e5ce1176 Support uploading disk images to volumes in API
Addresses #68
2020-02-09 13:45:04 -05:00
Joshua Boniface ce985234c3 Use consistent naming of components
Rename "pvcd" to "pvcnoded", and "pvc-api" to "pvcapid" so names for the
daemons are fully consistent. Update the names of the configuration
files as well to match this new formatting.

References #79
2020-02-08 19:34:07 -05:00