2262 Commits

Author SHA1 Message Date
20542c3653 Add profiler to cluster status function 2021-07-01 17:35:29 -04:00
00b503824e Set unstable version in API and CLI too 2021-07-01 17:35:11 -04:00
43009486ae Move Ceph pool/volume list assembly to thread pool
Same reasons as the VM list, though less impactful.
2021-07-01 17:33:13 -04:00
58789f1db4 Move VM list assembly to thread pool
This helps parallelize the numerous Zookeeper calls a little bit, at
least within the bounds of the GIL, to improve performance when getting
a large list of VMs. The max_workers value is capped at 32 to avoid
causing too many threads during concurrent executions, but still
provides a noticeable speedup (on the order of 0.2-0.4 seconds with 75
VMs, scaling up further as counts grow).
2021-07-01 17:32:47 -04:00
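The thread pool approach described in this commit could look roughly like the sketch below. It is a minimal illustration, not the project's code: `fetch_vm_info` is a hypothetical stand-in for the per-VM Zookeeper reads, and only the cap of 32 workers comes from the commit message.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS_CAP = 32  # cap taken from the commit message


def fetch_vm_info(vm_name):
    """Hypothetical stand-in for the per-VM Zookeeper reads."""
    return {"name": vm_name, "state": "start"}


def get_vm_list(vm_names):
    """Assemble the VM list using a bounded thread pool."""
    if not vm_names:
        return []
    # Never start more threads than there are VMs, and never more than the cap.
    workers = min(len(vm_names), MAX_WORKERS_CAP)
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # executor.map() returns results in the same order as the input.
        return list(executor.map(fetch_vm_info, vm_names))
```

Because the work is I/O-bound (waiting on Zookeeper), threads can overlap the waits even though the GIL prevents parallel execution of Python bytecode.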
baf4c3fbc7 Add performance profiler function
Usable anywhere that the global daemon "config" parameter can be passed
in (e.g. pvcapid/helper.py, pvcnoded/Daemon.py, etc.). Stores results in
a subdirectory of the PVC logdir called "profiler" if this directory can
be created, or prints results.

The debug config parameter ensures that the profiler can be added to
functions and not run unless the server is explicitly in debug mode.
Might not be useful as I don't initially plan to add this to every
function (only when investigating performance problems), but this
flexibility allows that to change later.
2021-07-01 14:01:33 -04:00
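A debug-gated profiler of the kind this commit describes might be sketched as a decorator built on the standard library's `cProfile`. The names here are assumptions for illustration: the `debug` and `log_directory` config keys and the output file naming are hypothetical, and the real function may differ.

```python
import cProfile
import functools
import io
import os
import pstats
import time


def profiler(config):
    """Hypothetical decorator factory; `config` is a dict-like daemon config."""

    def decorator(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            # Outside debug mode, call the function with no profiling overhead.
            if not config.get("debug", False):
                return function(*args, **kwargs)

            profile = cProfile.Profile()
            result = profile.runcall(function, *args, **kwargs)

            # Assumed layout: a "profiler" subdirectory of the log directory.
            profiler_dir = os.path.join(config.get("log_directory", "."), "profiler")
            try:
                os.makedirs(profiler_dir, exist_ok=True)
                filename = f"{function.__name__}_{int(time.time())}.prof"
                profile.dump_stats(os.path.join(profiler_dir, filename))
            except OSError:
                # Fall back to printing if the directory cannot be used.
                stream = io.StringIO()
                stats = pstats.Stats(profile, stream=stream)
                stats.sort_stats("cumulative").print_stats(10)
                print(stream.getvalue())
            return result

        return wrapper

    return decorator
```

With this shape, `@profiler(config)` can stay on a function permanently and only costs anything when the daemon runs in debug mode.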
e093efceb1 Add NoNodeError handlers in ZK locks
Instead of looping 5+ times acquiring an impossible lock on a
nonexistent key, just fail on a different error and return failure
immediately.

This is likely a major corner case that shouldn't happen, but better to
be safe than 500.
2021-07-01 01:17:38 -04:00
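The fail-fast behaviour described here can be sketched as follows. This is an illustration under assumptions: `NoNodeError` is a local stand-in for `kazoo.exceptions.NoNodeError` so the example runs without kazoo, and `make_lock` is a hypothetical callable that builds and acquires the lock.

```python
import time


class NoNodeError(Exception):
    """Stand-in for kazoo.exceptions.NoNodeError."""


def acquire_with_retries(make_lock, key, max_attempts=5, delay=0.0):
    """Return the acquired lock, or None on failure."""
    for _ in range(max_attempts):
        try:
            return make_lock(key)
        except NoNodeError:
            # The key does not exist, so retrying can never succeed.
            return None
        except Exception:
            # Any other failure may be transient; wait and try again.
            time.sleep(delay)
    return None
```

Catching the specific exception first means a missing key costs one attempt instead of the full retry loop.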
a080598781 Avoid superfluous ZK exists calls
These cause a major (2x) slowdown in read calls since Zookeeper
connections are expensive/slow. Instead, just try the thing and return
None if there's no key there.

Also wrap the children command in similar error handling since that did
not exist and could likely cause some bugs at some point.
2021-07-01 01:15:51 -04:00
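To illustrate why dropping the `exists` check helps, the sketch below compares the two read styles against a hypothetical in-memory client that counts round trips. `FakeZookeeper` and the local `NoNodeError` are stand-ins invented for this example; only the try-then-return-None idea comes from the commit.

```python
class NoNodeError(Exception):
    """Stand-in for kazoo.exceptions.NoNodeError."""


class FakeZookeeper:
    """Hypothetical in-memory client that counts round trips."""

    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    def exists(self, path):
        self.round_trips += 1
        return path in self.data

    def get(self, path):
        self.round_trips += 1
        if path not in self.data:
            raise NoNodeError(path)
        return self.data[path], None  # (data, stat) pair


def read_checked(zk, path):
    """Old style: check first, then read (two round trips on success)."""
    if not zk.exists(path):
        return None
    data, _ = zk.get(path)
    return data.decode("utf8")


def read_direct(zk, path):
    """New style: just try the read and handle the missing key."""
    try:
        data, _ = zk.get(path)
    except NoNodeError:
        return None
    return data.decode("utf8")
```

On a successful read, `read_checked` makes two calls and `read_direct` makes one, which is consistent with the roughly 2x slowdown the commit reports.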
39e82ee426 Cast base schema version to int
Or all our comparisons will fail later and nodes can't start.
v0.9.21
2021-06-30 09:40:33 -04:00
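The commit does not say exactly how the comparisons failed. One plausible reading, assumed here, is that the version was read back as a string: in Python 3, ordering a `str` against an `int` raises `TypeError`, and comparing two strings orders them lexicographically. The helper names below are hypothetical.

```python
def parse_schema_version(raw):
    """Convert a stored version such as b"3" or "3" to an int."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf8")
    return int(raw)


def needs_migration(stored_raw, target_version):
    """Compare numerically after casting the stored value."""
    return parse_schema_version(stored_raw) < target_version


# Pitfall when both sides stay strings: "10" sorts before "9".
string_comparison_is_wrong = "10" < "9"
```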
fe0a1d582a Bump version to 0.9.21 2021-06-29 19:21:31 -04:00
64feabf7c0 Fix adds in bump-version 2021-06-29 19:20:13 -04:00
cc841898b4 Pad dimensions of logos slightly 2021-06-29 19:16:35 -04:00
de5599b800 Revert "Try dark image instead"
This reverts commit 87f963df4ceada62f1a7cf002a755059cbdb0d2a.
2021-06-29 19:11:38 -04:00
87f963df4c Try dark image instead 2021-06-29 19:09:39 -04:00
de23c8c57e Add background colours to logos 2021-06-29 19:08:10 -04:00
c62a0a6c6a Revamp introduction text 2021-06-29 18:47:01 -04:00
12212177ef Update to new logo 2021-06-29 18:41:36 -04:00
6adaf1f669 Fix incorrect handling of deletions in init 2021-06-29 18:41:02 -04:00
b05c93e260 Fix bad return from initialize call 2021-06-29 18:31:56 -04:00
ffdd6bf3f8 Fix typo in command argument 2021-06-29 18:22:39 -04:00
aae9ae2e80 Fix incorrect handling of overwrite flag 2021-06-29 18:22:01 -04:00
f91c07fdcf Re-add UUID limit matching for full UUIDs
This *was* valuable when passing a full UUID in, so go back to that.
Verify first that the limit string is an actual UUID, and then compare
against it if applicable.
2021-06-28 12:27:43 -04:00
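The check described in this commit (verify the limit is a real UUID, then compare against it) might be sketched as below. Treating a non-UUID limit as a regular expression searched against the VM name is an assumption made for the example, as are the function names.

```python
import re
from uuid import UUID


def is_full_uuid(text):
    """True if `text` parses as a UUID (hyphenated or plain 32-hex form)."""
    try:
        UUID(text)
    except ValueError:
        return False
    return True


def vm_matches_limit(limit, vm_name, vm_uuid):
    """Match a full UUID exactly; otherwise match the limit against the name."""
    if is_full_uuid(limit):
        return UUID(limit) == UUID(vm_uuid)
    # Assumption: non-UUID limits are regex patterns applied to the name only.
    return re.search(limit, vm_name) is not None
```

A partial hex string such as `8d21` never parses as a UUID, so it is only ever compared against names and cannot match a random UUID by accident.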
4e2a1c3e52 Add worker wrapper to fix Deb incompatibility
Celery 5.x introduced a new worker argument format that is not
backwards-compatible with the older Celery 4.x format. This created a
conundrum since we use one service unit for both Debian 10 (4.x) and
Debian 11 (5.x). Instead of worse hacks, create a wrapper script to
start the worker with the correct arguments instead.
2021-06-28 12:19:29 -04:00
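The log does not show the wrapper itself, so the following is only a sketch of the idea in Python. It assumes the incompatibility is the one noted in the Celery 5.0 release notes, where global options such as `--app` must precede the subcommand; the application path `example.tasks` is hypothetical.

```python
def build_worker_command(celery_major, app, extra_args=()):
    """Build a worker command line suited to the installed Celery major version."""
    if celery_major >= 5:
        # Celery 5.x: global options go before the subcommand.
        command = ["celery", "--app", app, "worker"]
    else:
        # Celery 4.x: options were accepted after the subcommand.
        command = ["celery", "worker", "--app", app]
    return command + list(extra_args)


# Example usage with a hypothetical application path.
debian10 = build_worker_command(4, "example.tasks", ["--loglevel", "INFO"])
debian11 = build_worker_command(5, "example.tasks", ["--loglevel", "INFO"])
```

A single service unit can then call the wrapper and stay identical across both Debian releases.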
dbfa339cfb Ensure postinst and prerm always succeed 2021-06-23 20:35:40 -04:00
c54f66efa8 Limit match only on VM name
I can see no possible reason to want to do limits against UUIDs, but
supporting that means match is not what one would expect since a random
UUID could match the limit. So only limit based on the name.
2021-06-23 19:17:35 -04:00
cd860bae6b Optimize VM list in API
With many VMs this slows down linearly. Rework it a bit so there are
fewer calls to getInformationFromXML and so the processing could happen
in parallel at some point.
2021-06-23 19:14:26 -04:00
bbb132414c Restore shebang and don't do store if completion 2021-06-23 05:26:50 -04:00
04fa63f081 Only hit the network endpoint once
Otherwise this is hit for every VM which gets very slow very fast.
2021-06-23 05:15:48 -04:00
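The fix amounts to fetching the network data once and reusing it for every VM. A minimal sketch, with hypothetical function names and record fields (`vni`, `description`):

```python
def fetch_networks(call_log):
    """Hypothetical stand-in for one request that returns all networks."""
    call_log.append("networks")
    return [
        {"vni": 100, "description": "prod"},
        {"vni": 200, "description": "lab"},
    ]


def annotate_vms(vms, call_log):
    """Attach network descriptions to VMs using a single network fetch."""
    # One request up front, indexed for constant-time lookups per VM.
    networks_by_vni = {net["vni"]: net for net in fetch_networks(call_log)}
    annotated = []
    for vm in vms:
        net = networks_by_vni.get(vm["vni"])
        annotated.append({**vm, "network": net["description"] if net else None})
    return annotated
```

The request count stays at one regardless of how many VMs are listed, instead of growing with the VM count.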
f248d579df Convert pvc-client-cli into a proper Python module
Also fixes up the Debian packaging such that this works how I would
want, with proper module installation while leaving everything else
untouched. Finally implements automatic installation and removal of the
BASH completion for the PVC command.
2021-06-23 05:03:19 -04:00
f0db631947 Ignore root-level venv for testing 2021-06-23 02:15:49 -04:00
e91a597591 Merge branch 'sriov'
Implement SR-IOV support in PVC.

Closes #130
2021-06-23 00:58:44 -04:00
8d21da9041 Add some additional interaction tests 2021-06-22 22:08:51 -04:00
1ae34c1960 Fix bad messages in volume remove 2021-06-22 04:31:02 -04:00
75f2560217 Add documentation on SR-IOV client networks 2021-06-22 04:20:38 -04:00
7d2b7441c2 Mention SR-IOV in the Daemon and Ansible manuals 2021-06-22 03:55:19 -04:00
5ec198bf98 Update API doc with remaining items 2021-06-22 03:47:27 -04:00
e6b26745ce Adjust some help messages in pvc.py 2021-06-22 03:40:21 -04:00
3490ecbb59 Remove explicit ZK address from Patronictl command 2021-06-22 03:31:06 -04:00
2928d695c9 Ensure migration method is updated on state changes 2021-06-22 03:20:15 -04:00
7d2a3b5361 Ensure Macvtap NICs can use a model
Defaults to virtio like a bridged NIC. Otherwise performance is abysmal.
2021-06-22 02:38:16 -04:00
07dbd55f03 Use list comprehension to compare against source 2021-06-22 02:31:14 -04:00
26dd24e3f5 Ensure MTU is set on VF when starting up 2021-06-22 02:26:14 -04:00
6cd0ccf0ad Fix network check on VM config modification 2021-06-22 02:21:55 -04:00
1787a970ab Fix bug in address check format string 2021-06-22 02:21:32 -04:00
e623909a43 Store PHY MAC for VFs and restore after free 2021-06-22 00:56:47 -04:00
60e1da09dd Don't try any shenanigans when updating NICs
Trying to do this on the VMInstance side had problems because we can't
differentiate the 3 types of migration there. So, just update this in
the API side and hope everything goes well.

This introduces an edge bug: if a VM is using a macvtap SR-IOV device,
and then tries to migrate, and the migrate is aborted, the NIC lists
will be inconsistent.

When I revamp the VMInstance in the future, I should be able to correct
this, but for now we'll have to live with that edge case.
2021-06-22 00:00:50 -04:00
dc560c1dcb Better handle retcodes in migrate update 2021-06-21 23:46:47 -04:00
68c7481aa2 Ensure offline migrations update SR-IOV NIC states 2021-06-21 23:35:52 -04:00
7d42fba373 Ensure being in migrate doesn't abort shutdown 2021-06-21 23:28:53 -04:00
b532bc9104 Add missing managed flag for hostdev 2021-06-21 23:22:36 -04:00
24ce361a04 Ensure SR-IOV NIC states are updated on migration 2021-06-21 23:18:34 -04:00