Gevent was completely failure. The API would block during large file
uploads with no obvious solutions beyond "use gunicorn", which is not
suited to this. I originally had this working with the Flask "debug"
server, so just move to using that all the time. SSL is added using a
custom context with the OpenSSL library, so include that as a
dependency.
Use exclusive locks during API events which change VM state. This is
fairly critical to avoid potential duplicate updates. Only implemented
for these specifically required functions to avoid major performance
hits elsewhere.
Avoids situations where two migrates, to different nodes, happen in
rapid succession. Aborts the migration if the current target node no
longer matches what was set at the start of the execution.
The VM migration code was very old, very spaghettified, and prone to
strange failures.
Improve this by taking cues from the node primary migration. Use
synchronization between the nodes to ensure lockstep completion of the
migration in discrete steps.
A proper queue can be built later to integrate with this code more
cleanly.
References #108
By default, Werkzeug would require the entire file (be it an OVA or
image file) to be uploaded and saved to a temporary, fake file under
`/tmp`, before any further processing could occur. This blocked most of
the execution of these functions until the upload was completed.
This entirely defeated the purpose of what I was trying to do, which was
to save the uploads directly to the temporary blockdev in each case,
thus avoiding any sort of memory or (host) disk usage.
The solution is two-fold:
1. First, ensure that the `location='args'` value is set in
RequestParser; without this, the `files` portion would be parsed
during the argument parsing, which was the original source of this
blocking behaviour.
2. Instead of the convoluted request handling that was being done
originally here, instead entirely defer the parsing of the `files`
arguments until the point in the code where they are ready to be
saved. Then, using an override stream_factory that simply opens the
temporary blockdev, the upload can commence while being written
directly out to it, rather than using `/tmp` space.
This does alter the error handling slightly; it is impossible to check
if the argument was passed until this point in the code, so it may take
longer to fail if the API consumer does not specify a file as they
should. This is a minor trade-off and I would expect my API consumers to
be sane here.
Adds a check of (n-1) memory overprovisioning. (n-1) is considered to be
the configuration that excludes the "largest" node. The cluster will
report degraded when in this state.
Use the new "provisioned" memory field, instead of the "allocated"
memory field, to determine the optimal node when using the "mem"
migration selector. This will take into account non-running VMs in the
calculation as well as running VMs.
Adds a separate field to the node memory, "provisioned", which totals
the amount of memory provisioned to all VMs on the node, regardless of
state, and in contrast to "allocated" which only counts running VMs.
Allows for the detection of potential overprovisioned states when
factoring in non-running VMs.
Includes the supporting code to get this data, since the original
implementation of VM memory selection was dependent on the VM being
running and getting this from libvirt. Now, if the VM is not active, it
gets this from the domain XML instead.
Prevents any potential leakage due to autoconfigured IPv6 on bridged
interfaces. These are exclusively VM-side bridges, and the PVC host
should not have any IPv6 configuration on them, ever.
Prevents a bug where the thread can crash due to a change in the
d_domain object while running the for loop. By copying and iterating
over the copy, this becomes safer.