By default, Werkzeug would require the entire file (be it an OVA or
image file) to be uploaded and saved to a temporary, fake file under
`/tmp`, before any further processing could occur. This blocked most of
the execution of these functions until the upload was completed.
This entirely defeated the purpose of what I was trying to do, which was
to save the uploads directly to the temporary blockdev in each case,
thus avoiding any sort of memory or (host) disk usage.
The solution is two-fold:
1. First, ensure that the `location='args'` value is set in
RequestParser; without this, the `files` portion would be parsed
during the argument parsing, which was the original source of this
blocking behaviour.
2. Instead of the convoluted request handling that was being done
originally here, instead entirely defer the parsing of the `files`
arguments until the point in the code where they are ready to be
saved. Then, using an override stream_factory that simply opens the
temporary blockdev, the upload can commence while being written
directly out to it, rather than using `/tmp` space.
This does alter the error handling slightly; it is impossible to check
if the argument was passed until this point in the code, so it may take
longer to fail if the API consumer does not specify a file as they
should. This is a minor trade-off and I would expect my API consumers to
be sane here.
Adds a check of (n-1) memory overprovisioning. (n-1) is considered to be
the configuration that excludes the "largest" node. The cluster will
report degraded when in this state.
Use the new "provisioned" memory field, instead of the "allocated"
memory field, to determine the optimal node when using the "mem"
migration selector. This will take into account non-running VMs in the
calculation as well as running VMs.
Adds a separate field to the node memory, "provisioned", which totals
the amount of memory provisioned to all VMs on the node, regardless of
state, and in contrast to "allocated" which only counts running VMs.
Allows for the detection of potential overprovisioned states when
factoring in non-running VMs.
Includes the supporting code to get this data, since the original
implementation of VM memory selection was dependent on the VM being
running and getting this from libvirt. Now, if the VM is not active, it
gets this from the domain XML instead.
Prevents any potential leakage due to autoconfigured IPv6 on bridged
interfaces. These are exclusively VM-side bridges, and the PVC host
should not have any IPv6 configuration on them, ever.
Prevents a bug where the thread can crash due to a change in the
d_domain object while running the for loop. By copying and iterating
over the copy, this becomes safer.
The keepalive was getting stuck gathering memoryStats from the
non-running VM, since it was in a paused state. Avoid this by just
skipping past the rest of the stats gathering if the VM isn't running.
Since these will almost always connect to an IP rather than a "real"
hostname, don't verify the SSL cert (if applicable). Also allow the
overriding of SSL verification via an environment variable.
As a consequence, to reduce spam, SSL warnings are disabled for urllib3.
Instead, we warn in the "Using cluster" output whenever verification is
disabled.
Don't try to chmod every time, instead only chmod when first creating
the file. Also allow loading the default permission from an envvar
rather than hardcoding it.
1. Only build on GitLab when there's a tag.
2. Add the packages on GitLab to component "pvc" in the repo.
3. Add build-unstable-deb.sh script to build git-versioned packages.
4. Revamp build-and-deploy to use build-unstable-deb.sh and cut down on
output.
Most of these would silently fail if there was e.g. an issue with the ZK
connection. Instead, encase things in try blocks and handle the
exceptions in a more graceful way, returning None or False if
applicable. Except for locks, which should retry 5 times before
aborting.
Using simple print statements was annoying (lack of timing info and
formatting), so move to using the debug logger for these instead with a
custom state ('d') with white text to differentiate them. Also indicate
which subthread of the keepalive each task is being executed in for
easier tracing of issues.
Provide textual explanations for the degraded status, including
specific node/VM/OSD issues as well as detailed Ceph health. "Single
pane of glass" mentality.
Makes this output a little more realistic and allows proper monitoring
of the Ceph cluster status (separate from the PVC status which is
tracking only OSD up/in state).