Server management and system administration have changed significantly in the last decade. Computing as a resource is here, and software-defined is the norm. Gone are the days of pet servers, of tweaking configuration files by hand, and of painstakingly installing from ISO images in 52x CD-ROM drives. This is a brave new world.
As part of this trend, the rise of IaaS (Infrastructure as a Service) has created an entirely new way for administrators and, increasingly, developers, to interact with servers. They need to be able to provision virtual machines easily and quickly, to ensure those virtual machines are reliable and consistent, and to avoid downtime wherever possible.
However, the state of the Free Software, virtual management ecosystem in 2019 is quite dissapointing. On the one hand are the giant, IaaS products like OpenStack and CloudStack. These are massive pieces of software, featuring dozens of interlocking parts, designed for massive clusters and public cloud deployments. They're great for a "hyperscale" provider, a large-scale SaaS/IaaS provider, or an enterprise. But they're not designed for small teams or small clusters. On the other hand, tools like Proxmox, oVirt, and even good old fashioned shell scripts are barely scalable, are showing their age, and have become increasily unweildy for advanced usecases - great for one server, not so great for 9 in a highly-available cluster. Not to mention the constant attempts to monitize by throwing features behind Enterprise subscriptions. In short, there is a massive gap between the old-style, pet-based virtualization and the modern, large-scale, IaaS-type virtualization.
PVC aims to bridge this gap. As a Python 3-based, fully-Free Software, scalable, and redundant private "cloud" that isn't afraid to say it's for small clusters, PVC is able to provide the simple, easy-to-use, small cluster you need today, with minimal administrator work, while being able to scale as your system grows, supporting hundreds or thousands of VMs across dozens of nodes. High availability is baked right into the core software, giving you piece of mind about your cluster, and ensuring that your systems keep running no matter what happens. And the interface couldn't be easier - a straightforward Click-based CLI and a Flask-based HTTP API provide access to the cluster for you to manage, either directly or though scripts or WebUIs. And since everything is Free Software, you can always inspect it, customize it to your usecase, add features, and contribute back to the community if you so choose.
PVC provides all the features you'd expect of a "cloud" system - easy management of VMs, including live migration between nodes for maximum uptime; virtual networking support using either vLANs or EVPN-based VXLAN; shared, redundant, object-based storage using Ceph, and a convenient API interface for building your own interfaces. It is able to do this without being excessively complex, and without making sacrifices for legacy ideas.
If you need to run virtual machines, and don't have the time to learn the Stacks, the patience to deal with the old-style FOSS tools, or the money to spend on VMWare or Nutanix, PVC might be just what you're looking for.
A PVC cluster is based around "nodes", which are physical servers on which the various daemons, storage, networks, and virtual machines run. Each node is self-contained; it is able to perform any and all cluster functions if needed, and there are no segmentations of function between different types of physical hosts.
A limited number of nodes, called "coordinators", are statically configured to provide additional services for the cluster. All databases for instance run on the coordinators, but not other nodes. This prevents any issues with scaling database clusters across dozens of hosts, while still retaining maximum redundancy. In a standard configuration, 3 or 5 nodes are designated as coordinators, and additional nodes connect to the coordinators for database access where required. For quorum purposes, there should always be an odd number of coordinators, and exceeding 5 is likely not required even for large clusters.
The primary database for PVC is Zookeeper, which is a highly-available key-value store designed with consistency in mind. Each node connects to the Zookeeper cluster running on the coordinators to send and receive data from the rest of the cluster. The clients also interface with this Zookeeper cluster to configure and obtain state about the various objects in the cluster. This database is the central authority for all nodes.
Nodes are networked together via at least 3 different networks, set during bootstrap. The first is the "upstream" network, which provides upstream access for the nodes, for instance Internet connectivity, sending routes to client networks to upstream routers, etc. This should usually be a private/firewalled network to prevent unauthorized access to the cluster. The second is the "cluster" network, which is a private RFC1918 network that is unrouted and that nodes use to communicate between one another for Zookeeper access, Libvirt migrations, EVPN VXLAN tunnels, etc. The third is the "storage" network, which is used by the Ceph storage cluster for inter-OSD communication, allowing it to be separate from the main cluster network for maximum performance flexibility.
Within each node, the PVC daemon is a single Python 3 program which handles all node functionality, including networking, starting cluster services, managing creation/removal of VMs, networks, and storage, and providing utilization statistics and information to the cluster.
The daemon uses an object-oriented approach, with most cluster objects being represented by class objects of a specific type. Each node has a full view of all cluster objects and can interact with them based on events from the cluster as needed.
The CLI client is the most basic interface to the PVC cluster. It is a Python 3 Click application which interfaces directly with the Zookeeper cluster and provides a self-documenting CLI interface for PVC.
The HTTP API client is a more advanced interface to the PVC cluster, suitable for creating custom interfaces for PVC and providing better access control than the CLI. It is a Python 3 Flask application which also interfaces directly with the Zookeper cluster, and provides services on port 7370 by default. The API features a basic, key-based authentication mechanism to prevent unauthorized access, though this is optional, and can also provide HTTPS support if required for maximum security over public networks. With the exception of cluster initialization, the API can perform all functions that the CLI client can using a similar syntax and layout. Requests return JSON, and POST requests generally expect HTTP form responses.
Both the CLI and API clients use a common set of functions to perform the actual communication with the cluster, which is packaged separately as the `pvc-client-common` package. These functions can be used directly by 3rd-party Python interfaces for PVC if desired.
The overall management, deployment, bootstrapping, and configuring of nodes is accomplished via a set of Ansible roles, found in the [`pvc-ansible` repository](https://github.com/parallelvirtualcluster/pvc-ansible), and nodes are installed via a custom installer ISO generated by the [`pvc-installer` repository](https://github.com/parallelvirtualcluster/pvc-installer). Once the cluster is set up, nodes can be added, replaced, or updated using this Ansible framework.
PVC is written by Joshua M. Boniface. A Linux system administrator by trade, Joshua is always looking for the best solutions to his user's problems, be they developers or end users. PVC grew out of his frustration with the various FOSS virtualization tools, as well as and specifically, the constant failures of Pacemaker/Corosync to gracefully manage a virtualization cluster. He started work on PVC at the end of May 2018 and has been growing the feature set, as well as learning ever more about Python development, ever since.