Waiting for the daemons to stop took too much time on some nodes and
could throw off the lockstep. Instead, leverage background=True to run
the systemctl os_commands in the background (when they complete is
irrelevant), stop the Metadata API first, and don't delay during its
stop at all.
Tasks are no longer bound to the primary coordinator for state updates
due to using KeyDB and a proper shared queue and result backend, so this
warning is now obsolete and no longer required.
This would interrupt "--wait" commands on provisioner tasks, but we no
longer believe that this warrants a warning, as the affected user could
simply run "pvc cluster task" to validate or resume the watcher.
Removes the obsoleted "pvc provisioner status" command and replaces it
with a generalized "pvc cluster task" command to show all
currently-active or pending tasks on the cluster workers.
Move the create_vm and run_benchmark tasks to use the new Celery
subsystem, handlers, and wait command. Remove the obsolete, dedicated
API endpoints.
Standardize the CLI client and move the repeated handler code into a
separate common function.
Previously, we were assigning memalloc/memprov/vcpualloc during an
earlier phase using the main d_domain list. I'm not sure exactly why,
but this was throwing off stats after a fence. Instead, set these values
later on while parsing the actually-active VMs.
This is still needed due to the nature of the locks and freeing them on
startup, and to preserve lock=fail behaviour on VM startup.
Also fixes the fencing lock flush to directly use the client library
outside of Celery. I don't like this hack but it seems prudent until we
move fencing to the workers as well.
By default, tasks will continue to run as they did, on the primary
coordinator's task runner. However this opens the possibility for
defining more tasks that will run on other nodes or coordinators.
Redis did not provide a distributed solution for the worker, which
precluded several important planned functions. So instead, move to using
Zookeeper + PostgreSQL as the broker and result backend respectively.
Should be a seamless drop-in change but for future uses requires the
database host to be the primary coordinator IP rather than localhost, so
that writes can occur to the database from non-primary hosts.