Support uploading volume images to cluster #68
To help facilitate running arbitrary VMs and bypassing (most of) the provisioner, PVC should support uploading of arbitrary disk image files to the cluster.
As-is, there are a few options for how to go about this.
The Options
Node-local only
This approach only allows uploading files that are already present on the (primary) PVC node, but requires minimal changes to the logic of PVC's handling of Ceph volumes.
The administrator/user of the API (and via it the CLI as well) would pass the path of a node-local file to the client, which would then pass it on via the /cmd/ceph pipe in Zookeeper, as all current Ceph commands do, to the (primary) node daemon, which could then perform the required functions (analyzing the image, converting it to raw format, creating Ceph volumes, writing the image to mapped Ceph volumes, etc.). A sketch of this flow follows the pros and cons below.

Pros
Cons
- Would require new handling logic in the CephInstance section of the node daemon.
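For illustration, a minimal sketch of what the daemon-side handling could look like under this approach, assuming a JSON command payload and shelling out to qemu-img and rbd (the function name, payload format, and volume naming are hypothetical, not actual PVC code):

```python
# Hypothetical daemon-side handler for the node-local approach; the command
# payload format and function name are assumptions, not actual PVC code.
import json
import subprocess

def handle_upload_image(cmd_data):
    """Handle an 'upload image' request received via the /cmd/ceph pipe."""
    args = json.loads(cmd_data)  # e.g. {"pool": "vms", "volume": "disk0", "path": "/tmp/image.qcow2"}
    src, pool, volume = args["path"], args["pool"], args["volume"]

    # Analyze the source image to determine its virtual size
    info = json.loads(subprocess.check_output(
        ["qemu-img", "info", "--output=json", src]))
    size_mb = (info["virtual-size"] + 2**20 - 1) // 2**20

    # Create a Ceph RBD volume large enough to hold the raw image
    # (rbd's --size defaults to megabytes)
    subprocess.check_call(["rbd", "create", "--size", str(size_mb), f"{pool}/{volume}"])

    # Map the volume to a local block device, convert the image directly
    # onto it in raw format, then unmap it again
    device = subprocess.check_output(["rbd", "map", f"{pool}/{volume}"]).decode().strip()
    try:
        subprocess.check_call(["qemu-img", "convert", "-O", "raw", src, device])
    finally:
        subprocess.check_call(["rbd", "unmap", device])
```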
Client-local

This approach would involve the API client handling interaction with the file itself, including all stages of analyzing, converting, creating, and writing the image to the cluster. The CLI client would pass the raw contents of the image file to the API via HTTP. A sketch of this approach follows the pros and cons below.
Pros
Cons
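To make this concrete, here is a minimal sketch of what such an API endpoint could look like, streaming an uploaded raw image onto a freshly created RBD volume via the rados/rbd Python bindings (Flask is used purely for illustration; the endpoint path, parameters, and chunk size are assumptions, not the actual PVC API):

```python
# Hypothetical sketch of the client-local approach: the API process itself
# creates the RBD volume and streams the uploaded (raw) image onto it.
# The endpoint path, parameters, and chunk size are illustrative assumptions.
import rados
import rbd
from flask import Flask, request

app = Flask(__name__)
CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB write chunks

@app.route("/api/v1/storage/ceph/volume/upload", methods=["POST"])
def upload_volume():
    pool = request.args["pool"]
    volume = request.args["volume"]
    size = int(request.args["size"])  # raw image size in bytes

    # Requires direct access to the Ceph cluster from the API host
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            # Create the volume, then stream the request body onto it
            rbd.RBD().create(ioctx, volume, size)
            image = rbd.Image(ioctx, volume)
            try:
                offset = 0
                while True:
                    chunk = request.stream.read(CHUNK_SIZE)
                    if not chunk:
                        break
                    image.write(chunk, offset)
                    offset += len(chunk)
            finally:
                image.close()
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

    return {"message": f"Wrote {offset} bytes to {pool}/{volume}"}, 200
```

Note that this requires the API host to reach the Ceph storage network directly, which is exactly the assumption questioned in the next section.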
Broken Assumptions
Currently the provisioner portion of the API makes the assumption that the machine running it has direct access to the Ceph cluster, since several stages of the provisioning create_vm sequence involve directly mapping Ceph volumes, writing to them, mounting them, etc.

This breaks the original assumption of the "/ceph/cmd pipe" architecture, which was that the client could be running in another arbitrary location, with access only to Zookeeper, and still "perform" Ceph-related commands.

However, this architecture is, quite frankly, a big hack. I implemented it with the assumption that "the clients shouldn't touch the Ceph cluster", which made sense at the time, but the combination of (a) implementing the provisioner, (b) turning the CLI client into an API client, and (c) effectively requiring the API to run on the primary coordinator makes this assumption far less relevant. And during provisioner testing, I've seen at least one case where overloading this command pipe can be disastrous to the running daemon, so this would be a bugfix as well.
One of the original goals of this architecture was to be able to have an "API VM" on the cluster, which would have access to the "cluster" network but not the "storage" network directly. This would isolate the API from the Ceph cluster, which I saw as a security benefit. This would still be possible with the provisioner in its current state, however, by adding a "brstorage" bridge to go alongside the "brcluster" bridge that currently exists, then simply ensuring that this special VM has access to both along with the Ceph cluster itself. That said, I've mostly abandoned the idea of separating the API like this, and doing so would be a completely manual operation. So the assumption that originally triggered the "/cmd/ceph pipe" architecture has already been de facto abandoned.

Takeaways and future plans
Given the broken assumption and the "ideal" second method of implementing image uploads, I think it makes a lot of sense to rework the Ceph command pipe, the assumptions of the common client functions regarding Ceph commands, and likely also add the "brstorage" bridge "just in case".
This would provide two very nice changes:
- Simplify the client handling, deciding once-and-for-all that direct Ceph access, in addition to direct Zookeeper access, is required for the "core" clients - the API and any 3rd-party, direct-binding client (see the sketch below).
- Simplify the node handling, since the node daemon would no longer need to respond to the /cmd/ceph pipe or perform these functions itself.
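As a sketch of what a "core" client function with direct Ceph access could look like after such a refactor (the function itself is a hypothetical example using the rados/rbd Python bindings):

```python
# Hypothetical "core" client function after the refactor: instead of writing
# a command to the /cmd/ceph pipe and waiting on the node daemon, the client
# talks to the Ceph cluster directly via the Python bindings.
import rados
import rbd

def get_volume_list(pool):
    """Return the names of all RBD volumes in a pool via direct Ceph access."""
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            return rbd.RBD().list(ioctx)
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```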
In the time it's taken me to write this, I've effectively decided that the second (client-local) method, as well as the full Ceph client handling refactor, is definitely the way to go, but feedback is requested, especially in light of the cons. I will take a short break (1-2 weeks) before implementing this either way.
mentioned in issue #80
Went with the "Client-local" methodology and have already eliminated the /ceph/cmd pipe in #80. Next step is implementing image upload functionality directly through the API.

mentioned in commit 49e5ce1176
mentioned in commit e419855911
closed via commit 1de57ab6f3
mentioned in commit 1231ba19b7