
What is PVC?

PVC is a Linux KVM-based hyperconverged infrastructure (HCI) virtualization cluster solution that is fully Free Software, scalable, redundant, self-healing, self-managing, and designed for administrator simplicity. It is an alternative to other HCI solutions such as Ganeti, Harvester, Nutanix, and VMware, as well as to other common virtualization stacks such as Proxmox and OpenStack.

PVC is a complete HCI solution, built from well-known and well-trusted Free Software tools, to assist an administrator in creating and managing a cluster of servers to run virtual machines, as well as self-managing several important aspects including storage failover, node failure and recovery, virtual machine failure and recovery, and network plumbing. It is designed to act consistently, reliably, and unobtrusively, letting the administrator concentrate on more important things.

PVC is highly scalable. From a minimum (production) node count of 3 up to 12 or more nodes, and supporting many dozens of VMs, PVC scales along with your workload and requirements. Deploy a cluster once and grow it as your needs expand.

As a consequence of its features, PVC makes administrating very high-uptime VMs extremely easy, featuring VM live migration, built-in always-enabled shared storage with transparent multi-node replication, and consistent network plumbing throughout the cluster. Nodes can also be seamlessly removed from or added to service, with zero VM downtime, to facilitate maintenance, upgrades, or other work.

PVC also features an optional, fully customizable VM provisioning framework, designed to automate and simplify VM deployments using custom provisioning profiles, scripts, and CloudInit userdata API support.
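As an illustration of the userdata mechanism, a provisioning profile could attach a standard `#cloud-config` document to new VMs. The snippet below is a generic cloud-init sketch, not taken from PVC's documentation; the hostname, user, and package choices are placeholders:

```yaml
#cloud-config
# Generic cloud-init userdata example; all values here are illustrative.
hostname: web01
users:
  - name: admin
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ssh-ed25519 AAAA... admin@example.com
packages:
  - nginx
runcmd:
  - systemctl enable --now nginx
```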

Installation of PVC is accomplished via two main components: a Node installer ISO which creates on-demand installer ISOs, and an Ansible role framework to configure, bootstrap, and administer the nodes. Installation can also be fully automated with a companion cluster bootstrapping system. Once up, the cluster is managed via an HTTP REST API, accessible via a Python Click CLI client or WebUI (eventually).

Just give it physical servers, and it will run your VMs without you having to think about it, all in just an hour or two of setup time.

Getting Started

To get started with PVC, please see the About page for general information about the project, and the Getting Started page for details on configuring your first cluster.

Changelog

View the changelog in CHANGELOG.md. Please note that any breaking changes are announced there; ensure you read the changelog before upgrading!

Screenshots

These screenshots show some of the available functionality of the PVC system and CLI as of PVC v0.9.85.

0. Integrated help
The CLI features an integrated, fully-featured help system to show details about every possible command.

1. Connection management
A single CLI instance can manage multiple clusters and provides a quick detail view of each; it will default to a "local" connection if an "/etc/pvc/pvc.conf" file is found. Sensitive API keys are hidden by default.
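Conceptually, each stored connection pairs an API endpoint with its credentials. The sketch below is purely hypothetical: the keys and layout are illustrative only and do not reflect the real client configuration schema; consult the CLI's own connection-management commands and `pvc.sample.conf` in this repository for authoritative formats:

```yaml
# Hypothetical sketch only: key names below are illustrative, not the
# actual schema. A connection conceptually pairs an API endpoint with
# its (hidden-by-default) API key.
connections:
  production:
    host: pvc.example.com
    port: 7370
    scheme: https
    api_key: <redacted>
```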

2. Cluster details and output formats
PVC can show the key details of your cluster at a glance, including health, persistent fault events, and key resources; the CLI can output in both a pretty human-readable format and JSON for easier machine parsing in scripts.
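Because the JSON output mode is intended for scripting, status data can be consumed directly from a script. The snippet below is a hedged sketch: the field names in `sample_output` are hypothetical stand-ins, not the CLI's actual schema, so inspect your own cluster's JSON output before relying on any particular key:

```python
import json

# Hypothetical, simplified stand-in for the CLI's JSON status output;
# the real schema may differ.
sample_output = """
{
  "cluster_health": {"health": 100, "messages": []},
  "nodes": {"total": 3, "run": 3},
  "vms": {"total": 12, "start": 11, "stop": 1}
}
"""

status = json.loads(sample_output)

# A monitoring script might alert when cluster health degrades:
if status["cluster_health"]["health"] < 100:
    summary = "cluster degraded: " + ", ".join(status["cluster_health"]["messages"])
else:
    summary = "cluster healthy; {}/{} VMs running".format(
        status["vms"]["start"], status["vms"]["total"]
    )
print(summary)
```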

3. Node information
PVC can show details about the nodes in the cluster, including their live health and resource utilization.

4. VM information
PVC can show details about the VMs in the cluster, including their state, resource allocations, current hosting node, and metadata.

5. VM details
In addition to the above basic details, PVC can also show extensive information about a running VM's devices and other resource utilization.

6. Network information
PVC has two major client network types, and ensures a consistent configuration of client networks across the entire cluster; managed networks can feature DHCP, DNS, firewall, and other functionality including DHCP reservations.

7. Storage information
PVC provides a convenient abstracted view of the underlying Ceph system and can manage all core aspects of it.

8. VM and node logs
PVC can display logs from VM serial consoles (if properly configured) and nodes in-client to facilitate quick troubleshooting.

9. VM and worker tasks
PVC provides full VM lifecycle management, as well as long-running worker-based commands (in this example, clearing a VM's storage locks).

10. Provisioner
PVC features an extensively customizable and configurable VM provisioner system, including EC2-compatible CloudInit support, allowing you to define flexible VM profiles and provision new VMs with a single command.

11. Prometheus and Grafana dashboard
PVC features several monitoring integration examples under "node-daemon/monitoring", including CheckMK, Munin, and, most recently, Prometheus, the latter with an example Grafana dashboard for cluster monitoring and alerting.