Joshua Boniface
b413e042a6
Previously, contention could occasionally cause a flap/dual primary contention state due to the lack of checking within this function. This could cause a state where a node transitions to primary than is almost immediately shifted away, which could cause undefined behaviour in the cluster. The solution includes several elements: * Implement an exclusive lock operation in zkhandler * Switch the become_primary function to use this exclusive lock * Implement exclusive locking during the contention process * As a failsafe, check stat versions before setting the node as the primary node, in case another node already has * Delay the start of takeover/relinquish operations by slightly longer than the lock timeout * Make the current router_state conditions more explicit (positive conditionals rather than negative conditionals) The new scenario ensures that during contention, only one secondary will ever succeed at acquiring the lock. Ideally, the other would then grab the lock and pass, but in testing this does not seem to be the case - the lock always times out, so the failsafe check is technically not needed but has been left as an added safety mechanism. With this setup, the node that fails the contention will never block the switchover nor will it try to force itself onto the cluster after another node has successfully won contention. Timeouts may need to be adjusted in the future, but the base timeout of 0.4 seconds (and transition delay of 0.5 seconds) seems to work reliably during preliminary tests. |
||
---|---|---|
api-daemon | ||
client-cli | ||
daemon-common | ||
debian | ||
docs | ||
node-daemon | ||
.file-header | ||
.gitignore | ||
.gitlab-ci.yml | ||
LICENSE | ||
README.md | ||
build-and-deploy.sh | ||
build-deb.sh | ||
gen-api-doc | ||
gen-api-migrations | ||
mkdocs.yml | ||
pvc_logo.svg |
README.md
PVC - The Parallel Virtual Cluster system
NOTICE FOR GITHUB: This repository is a read-only mirror of the PVC repositories from my personal GitLab instance. Pull requests submitted here will not be merged. Issues submitted here will however be treated as authoritative.
PVC is a KVM+Ceph+Zookeeper-based, Free Software, scalable, redundant, self-healing, and self-managing private cloud solution designed with administrator simplicity in mind. It is built from the ground-up to be redundant at the host layer, allowing the cluster to gracefully handle the loss of nodes or their components, both due to hardware failure or due to maintenance. It is able to scale from a minimum of 3 nodes up to 12 or more nodes, while retaining performance and flexibility, allowing the administrator to build a small cluster today and grow it as needed.
The major goal of PVC is to be administrator friendly, providing the power of Enterprise-grade private clouds like OpenStack, Nutanix, and VMWare to homelabbers, SMBs, and small ISPs, without the cost or complexity. It believes in picking the best tool for a job and abstracting it behind the cluster as a whole, freeing the administrator from the boring and time-consuming task of selecting the best component, and letting them get on with the things that really matter. Administration can be done from a simple CLI or via a RESTful API capable of building full-featured web frontends or additional applications, taking a self-documenting approach to keep the administrator learning curvet as low as possible. Setup is easy and straightforward with an ISO-based node installer and Ansible role framework designed to get a cluster up and running as quickly as possible. Build your cloud in an hour, grow it as you need, and never worry about it: just add physical servers.
Getting Started
To get started with PVC, read the Cluster Architecture document, then see Installing for details on setting up a set of PVC nodes, using the PVC Ansible framework to configure and bootstrap a cluster, and managing it with the pvc
CLI tool or RESTful HTTP API. For details on the project, its motivation, and architectural details, see the About page.
Changelog
v0.7
Numerous improvements and bugfixes, revamped documentation. This release is suitable for general use and is beta-quality software.
v0.6
Numerous improvements and bugfixes, full implementation of the provisioner, full implementation of the API CLI client (versus direct CLI client). This release is suitable for general use and is beta-quality software.
v0.5
First public release; fully implements the VM, network, and storage managers, the HTTP API, and the pvc-ansible framework for deploying and bootstrapping a cluster. This release is suitable for general use, though it is still alpha-quality software and should be expected to change significantly until 1.0 is released.
v0.4
Full implementation of virtual management and virtual networking functionality. Partial implementation of storage functionality.
v0.3
Basic implementation of virtual management functionality.