Move to Celery jobs for per-node tasks #166
Currently, there are at least two broad categories of tasks that use a custom, hacky Zookeeper-as-message-queue system to deliver tasks from the API to a particular node:

- `base.cmd.ceph`, in `daemon-common/ceph.py`
- `base.cmd.domain`, in `daemon-common/vm.py`
The current implementation is very suboptimal for a few reasons:
- Execution is handled by the node daemon rather than the API, in contrast with all other commands. This makes the API code harder to understand and manage, as it is split across two places.
- Execution is potentially flaky, with nodes possibly ignoring commands depending on the node daemon's state or other variables.
- There is no way for the node to send status callbacks to the API, especially on errors, leading to unhelpful "see node logs" messages.
With the work done in #165, it is now possible to use the API worker to execute these tasks instead.
Move the called tasks in the node daemon into the daemon lib, and execute them as Celery tasks through the API instead, using the newly created `@celery.task(..., routing_key='blah')` and a corresponding `blah=<node>` argument to determine, from the API, where to run them, similar to how the provisioner create and storage benchmark commands currently work. The routing key will likely just be `run_on` for both, and thus the target node can be passed to a custom `run_on` kwarg in the calling function. All of these can then have task status commands, which may require a client refactor but will allow for consistent status updates.

We can then deprecate the old `base.cmd` endpoints in Zookeeper, as they will no longer be used.

VM tasks are easier to test, so those will be done first.
VM tasks are implemented; next up are OSD tasks.
Completed with version 0.9.81 release.