Move to Zookeeper as API worker broker #165
The Redis broker has served well, but has some big limitations:
No usable replication between coordinators. Redis replication exists, but it is statically configured and not multi-master (dealbreaker).
Without replication, each node's Redis instance is isolated from the others, causing issues.
Prevents using the Celery worker for anything that isn't on "this node" i.e. the primary coordinator.
Investigate the experimental Zookeeper broker instead. If this works, it can open up several interesting avenues:
Provisioner can work against any arbitrary node, and can continue to report status if the primary node changes.
Other tasks can be delegated to the API worker, e.g. Ceph OSD operations or other similar node-specific operations.
Well, that's impossible, because the Redis devs are lazy. They support every random-ass thing under the sun but not etcd or Zookeeper (i.e. sensible KV stores).
I guess the only option is to hack in Redis multi-master using Dynomite.
https://fatihmalakci.com/how-to-setup-redis-master-master-replication/
Was able to get this working with Zookeeper as the message transport and PostgreSQL as the results backend, and it seems to work fine. My earlier confusion was that Zookeeper is only usable as a message broker, so something else is needed as the results backend; PostgreSQL fills that role fine.
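For reference, a minimal sketch of what the broker/backend split could look like in Celery settings. The helper name, hostnames, and credentials are hypothetical and not taken from the PVC code; the only assumptions about Celery itself are the `broker_url` and `result_backend` setting names, the `zookeeper://` transport scheme, and the `db+postgresql://` SQLAlchemy backend scheme.

```python
from urllib.parse import quote


def build_celery_config(zk_host, zk_port, pg_host, pg_port, pg_user, pg_password, pg_db):
    """Build Celery settings: Zookeeper as broker, PostgreSQL as results backend.

    Hypothetical helper for illustration only.
    """
    # Zookeeper can only act as the message transport (broker).
    broker_url = f"zookeeper://{zk_host}:{zk_port}"

    # Results need a separate store; Celery's SQLAlchemy backend uses a "db+" prefix.
    # The password is percent-encoded so special characters do not break the URL.
    result_backend = (
        f"db+postgresql://{pg_user}:{quote(pg_password, safe='')}"
        f"@{pg_host}:{pg_port}/{pg_db}"
    )

    return {"broker_url": broker_url, "result_backend": result_backend}


# Example values are placeholders, not real cluster settings.
config = build_celery_config(
    "127.0.0.1", 2181, "10.0.0.1", 5432, "pvcapi", "p@ss", "pvcapi"
)
```

The resulting dict could then be applied with something like `app.conf.update(config)` on a Celery application object.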
Code written and merged. Can use a custom kwarg in the function calls (likely always run_on), in combination with @celery.task(..., routing_key='run_on'), to set the worker delegation for tasks that require it. All nodes will have workers, started alongside the node daemon, so these tasks can be properly delegated to them leveraging the above queues.
This necessitated a couple config changes:
pvcapid.yaml, remove the queue: subsection as it is not required.
pvcapid.yaml, adjust the default PostgreSQL connection to be the cluster floating IP address instead of localhost.
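A rough sketch of how the run_on delegation described above could be wired up with a Celery router function. This is not the merged code: `make_router`, the `"primary"` fallback queue, and the task and node names are hypothetical. It assumes the task's `routing_key` option reaches the router in `options`, and that each node's worker consumes a queue named after that node.

```python
def make_router(default_queue):
    """Return a Celery-style router that picks the queue from a task kwarg.

    Hypothetical helper: the kwarg to inspect is named by the task's
    routing_key option (e.g. routing_key='run_on').
    """

    def route_task(name, args, kwargs, options, task=None, **kw):
        kwargs = kwargs or {}
        key = (options or {}).get("routing_key")
        # If the task declared a routing key and the caller passed that kwarg,
        # send the task to the queue named by the kwarg's value (a node name).
        if key and key in kwargs:
            return {"queue": kwargs[key]}
        # Otherwise fall back to the default queue.
        return {"queue": default_queue}

    return route_task


router = make_router("primary")

# A task called with run_on="hv2" is routed to the "hv2" queue.
delegated = router("osd.add", (), {"run_on": "hv2"}, {"routing_key": "run_on"})

# A task without the kwarg stays on the default queue.
fallback = router("vm.create", (), {}, {"routing_key": "run_on"})
```

With Celery, such a function would typically be registered via `app.conf.task_routes = (router,)`, and each node's worker started with `-Q <node name>` so it consumes only its own queue.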