This helps parallelize the numerous Zookeeper calls a little bit, at
least within the bounds of the GIL, to improve performance when getting
a large list of VMs. The max_workers value is capped at 32 to avoid
causing too many threads during concurrent executions, but still
provides a noticeable speedup (on the order of 0.2-0.4 seconds with 75
VMs, scaling up further as counts grow).
Usable anywhere that the global daemon "config" parameter can be passed
in (e.g. pvcapid/helper.py, pvcnoded/Daemon.py, etc.). Stores results in
a subdirectory of the PVC logdir called "profiler" if this directory can
be created, or prints results.
The debug config parameter ensures that the profiler can be added to
functions and not run unless the server is explicitly in debug mode.
Might not be useful as I don't initially plan to add this to every
function (only when investigating performance problems), but this
flexibility allows that to change later.
Instead of looping 5+ times acquiring an impossible lock on a
nonexistent key, just fail on a different error and return failure
immediately.
This is likely a major corner case that shouldn't happen, but better to
be safe than 500.
These cause a major (2x) slowdown in read calls since Zookeeper
connections are expensive/slow. Instead, just try the thing and return
None if there's no key there.
Also wrap the children command in similar error handling since that did
not exist and could likely cause some bugs at some point.
This *was* valuable when passing a full UUID in, so go back to that.
Verify first that the limit string is an actual UUID, and then compare
against it if applicable.
I can see no possible reason to want to do limits against UUIDs, but
supporting that means match is not what one would expect since a random
UUID could match the limit. So only limit based on the name.
With many VMs this slows down linearly. Rework it a bit so there are
fewer calls to getInformationFromXML and so the processing could happen
in parallel at some point.
Trying to do this on the VMInstance side had problems because we can't
differentiate the 3 types of migration there. So, just update this in
the API side and hope everything goes well.
This introduces an edge bug: if a VM is using a macvtap SR-IOV device,
and then tries to migrate, and the migrate is aborted, the NIC lists
will be inconsistent.
When I revamp the VMInstance in the future, I should be able to correct
this, but for now we'll have to live with that edgecase.
Adds support for the node daemon managing SR-IOV PF and VF instances.
PFs are added to Zookeeper automatically based on the config at startup
during network configuration, and are otherwise completely static. PFs
are automatically removed from Zookeeper, along with all coresponding
VFs, should the PF phy device be removed from the configuration.
VFs are configured based on the (autocreated) VFs of each PF device,
added to Zookeeper, and then a new class instance, SRIOVVFInstance, is
used to watch them for configuration changes. This will enable the
runtime management of VF settings by the API. The set of keys ensures
that both configuration and details of the NIC can be tracked.
Most keys are self-explanatory, especially for PFs and the basic keys
for VFs. The configuration tree is also self-explanatory, being based
entirely on the options available in the `ip link set {dev} vf` command.
Two additional keys are also present: `used` and `used_by`, which will
be able to track the (boolean) state of usage, as well as the VM that
uses a given VIF. Since the VM side implementation will support both
macvtap and direct "hostdev" assignments, this will ensure that this
state can be tracked on both the VF and the VM side.
Required a bit of refactoring in the validation code to ensure we have
direct access, without relying on the translations done in the normal
zkhandler functions.
Adds a new class, ZKSchema, to handle schema management in Zookeeper in
an automated and consistent way. This should solve several issues:
1. Pain in managing changes to ZK keys
2. Pain in handling those changes during live upgrades
3. Simplifying the codebase to remove hardcoded ZK paths
The current master schema for PVC 0.9.19 is committed as version 0.
Addresses #129
Ensures that the bytes_tohuman returns an integer to avoid the hacky
workaround of stripping off the B.
Adds a verification on the size of a new volume, that it is not larger
than the free space of the pool to prevent errors/excessively-large
volumes from being created.
Closes#120
Sets in the node daemon, returns via the API, and shows in the CLI,
information about the live VNC listen address and port for VNC-enabled
VMs.
Closes#115
1. Use a consistent "is not None" to verify records are changing.
2. Fix bug where IPv6 network had no remove setter (it is now a blank
string, the first thing I would expect).
3. 1 fixes a bug whereby it was impossible to unset DHCPv4 status.
Allow a VM to specify its migration type as a default choice. The valid
options are "default" (i.e. behave as now), "live" which forces a live
migration only, and "shutdown" which forces a shutdown migration only.
The new option is treated as a VM meta option and is set to default if
not found.
Use exclusive locks during API events which change VM state. This is
fairly critical to avoid potential duplicate updates. Only implemented
for these specifically required functions to avoid major performance
hits elsewhere.
Adds a check of (n-1) memory overprovisioning. (n-1) is considered to be
the configuration that excludes the "largest" node. The cluster will
report degraded when in this state.
Use the new "provisioned" memory field, instead of the "allocated"
memory field, to determine the optimal node when using the "mem"
migration selector. This will take into account non-running VMs in the
calculation as well as running VMs.
Adds a separate field to the node memory, "provisioned", which totals
the amount of memory provisioned to all VMs on the node, regardless of
state, and in contrast to "allocated" which only counts running VMs.
Allows for the detection of potential overprovisioned states when
factoring in non-running VMs.
Includes the supporting code to get this data, since the original
implementation of VM memory selection was dependent on the VM being
running and getting this from libvirt. Now, if the VM is not active, it
gets this from the domain XML instead.
Provide textual explanations for the degraded status, including
specific node/VM/OSD issues as well as detailed Ceph health. "Single
pane of glass" mentality.
Makes this output a little more realistic and allows proper monitoring
of the Ceph cluster status (separate from the PVC status which is
tracking only OSD up/in state).
This wasn't happening automatically, nor does it happen with qemu-img
commands, so we have to manually trigger a libvirt blockResize against
the volume. This setup is a little roundabout but seems to work fine.
Provides a CLI and API argument to force live migration, which triggers
a new VM state "migrate-live". The node daemon VMInstance during migrate
will read this flag from the state and, if enforced, will not trigger a
shutdown migration.
Closes#95
Implements wait support for VM restart, shutdown, move, migrate, and
unmigrate commands, similar to node flush/node unflush.
Includes some additional refactoring of the move command to make its
operation identical to migrate, only without recording the previous
node.
References #72
Allow the user to specify other, non-raw files and upload them,
performing a conversion with qemu-img convert and a temporary block
device as a shim (since qemu-img can't use FIFOs).
Also ensures that the target volume exists before proceeding.
Addresses #68